Bug#663916: New phonetisaurus package available
On 20/10/2012 22:55, Jakub Wilk wrote:
> * Giulio Paci <firstname.lastname@example.org>, 2012-10-20, 00:00:
>> I just had a look at the already open bugs and found that there is an RFP bug for utfcpp:
>> Do you think I should do anything else (e.g., reply to the bug with the maintainers of the packages you identified in CC)?
> I think reply+cc would be a good idea, but I won't insist.
> If I run phonetisaurus-align without arguments, it segfaults:
> | $ phonetisaurus-align
> | Loading input file:
> | Starting EM...
> | Finished first iter...
> | Iteration: 1 Change: nan
> | Iteration: 2 Change: nan
> | Iteration: 3 Change: nan
> | Iteration: 4 Change: nan
> | Iteration: 5 Change: nan
> | Iteration: 6 Change: nan
> | Iteration: 7 Change: nan
> | Iteration: 8 Change: nan
> | Iteration: 9 Change: nan
> | Iteration: 10 Change: nan
> | Iteration: 11 Change: nan
> | Last iteration:
> | Segmentation fault
> The manpage seems to imply that --input and --ofile options are mandatory, so I'm not sure what it is even trying to do... But it certainly shouldn't segfault.
--input is indeed mandatory. I added a patch to prevent the segfault. The resulting message is not very clear, but I hope it is enough.
> Shouldn't phonetisaurus-align input format be documented somewhere? BTW, it aborts without any helpful error message if the input file is not valid:
> | $ echo foobar > invalid.txt
> | $ phonetisaurus-align --input=invalid.txt --ofile=invalid.corpus
> | Loading input file: tiny.bsf
> | terminate called after throwing an instance of 'std::out_of_range'
> | what(): vector::_M_range_check
> | Aborted
I documented the format in the manpage. The input format is very generic, so it would probably be difficult to detect invalid input files.
The error above occurs because the program expects a two-column file and only one column was provided. Now this error no longer occurs (in my opinion one-column files are still valid input, but I am waiting for confirmation from upstream).
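Until upstream answers, a quick pre-check outside the program can flag lines that have fewer columns than expected. The following is only a sketch in Python and is not part of phonetisaurus: the function name, the tab separator and the two-column minimum are my assumptions, and the separator that phonetisaurus-align really uses depends on the options passed to it.

```python
def find_short_lines(lines, sep="\t", min_columns=2):
    """Return (line_number, text) pairs for lines with too few columns.

    Hypothetical helper: 'sep' and 'min_columns' are assumptions,
    not values taken from phonetisaurus itself.
    """
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        if not text.strip():
            continue  # ignore blank lines
        if len(text.split(sep)) < min_columns:
            bad.append((number, text))
    return bad


def check_file(path, sep="\t", min_columns=2):
    """Open 'path' and report its short lines (same assumptions as above)."""
    with open(path, encoding="utf-8") as handle:
        return find_short_lines(handle, sep=sep, min_columns=min_columns)
```

For the invalid.txt file from the report, check_file("invalid.txt") would list line 1 ("foobar") as having a single column, which is the case that used to trigger the std::out_of_range abort.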
> What does --fst_field_separator exactly do? In my experiments it did not affect phonetisaurus-align in any way.
Unfortunately, I do not know what most of this program's options do. I have asked upstream about this one.