Added basic and full editions compiled for .NET Framework 3.5 Client Profile. This makes it possible to use NTextCat within SQL Server 2008.
There are two edition types: basic and full.
The basic edition contains the original textcat language models (which have poor encoding coverage) plus "Wikipedia-Experimental-UTF8Only" (models that can identify the language of UTF-8 text only, or of a string if you use ClassifyText from the API).
The full edition contains all files of the basic edition plus the FULL language pack, "Wikipedia-Experimental-AllEncodings": 27263 language models (280 languages and flavors of Wikipedia, each encoded in every encoding capable of representing at least 90% of the sample text).
PLEASE BE AWARE OF A DELAY OF AROUND 40 SECONDS BEFORE THE APPLICATION SHOWS ITS PROMPT WHEN YOU START IT FOR THE FIRST TIME.
This happens because of the huge number of language models being loaded (27263).
The delay is around 15 seconds on the second start of the application (because all files will already be cached).
The full matrix of language-encoding compatibility can be found in languageEncodingMatrix.csv (pairs with values of >90% are included in the release).
Sample material can be found in the Evaluation folder (some languages I know, in popular encodings).
Example of usage (default settings used):
NTextCatLegacy.exe -noprompt < Evaluation\ukrainian-1251.txt
The first result returned is considered the best. The format is <language>_cp<codepage>, e.g. uk_cp1251.
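
If you post-process the output in a script, the result label can be split into its two parts. The following Python snippet is only a sketch and not part of NTextCat: the function name parse_result is hypothetical, and it assumes the label always ends in "_cp" followed by a numeric code page, as in the example above. Labels from other model sets (such as the original textcat models) may not follow this pattern, in which case the function returns None.

```python
import re

# Assumed label shape: <language>_cp<codepage>, e.g. "uk_cp1251".
_RESULT_RE = re.compile(r"^(?P<language>.+)_cp(?P<codepage>\d+)$")


def parse_result(label):
    """Split a result label into (language, codepage).

    Returns None when the label does not match the assumed shape.
    """
    match = _RESULT_RE.match(label.strip())
    if match is None:
        return None
    return match.group("language"), int(match.group("codepage"))


language, codepage = parse_result("uk_cp1251")
print(language, codepage)

# The code page number can then be used to decode the original bytes.
# Python names Windows code pages "cp<number>", e.g. "cp1251".
raw = "Привіт".encode("cp1251")
print(raw.decode("cp%d" % codepage))
```

Only the first line of the tool's output needs to be passed to parse_result, since the first result is the best one.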