
A Text File Splitter user ran into a problem with file encodings. This is a known issue that I'm happy to report is almost resolved. He gave me a file for testing, and that has helped enormously. Thanks Zhou!
Work on Text File Splitter 3.0 has slowed to a crawl, but I had already converted the code from 2.2.1 over to .NET 4.0. I decided to just finish this work and release it as 2.5.0.
Here's a screenshot of Text File Splitter detecting a UTF-8 file without the Byte Order Mark (BOM):

Here's a screenshot of a file with a BOM:

I'm using a library called "ude", which is a C# port of the Mozilla Universal Charset Detector. I had to put in a bug fix to deal with very large files. At least the file encoding detection, the first half of this feature, is now done. Now I need to deal with the encoding of the file chunks. This has taken a lot more time and code than I expected. Hopefully, this will solve this nagging issue once and for all.
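To make the two halves of the feature concrete, here is a rough sketch in Python. It is not the ude library and not Text File Splitter's actual code: ude uses statistical detection, while this sketch only checks for a BOM and then tests whether the bytes are valid UTF-8. The names `detect_encoding` and `split_utf8` are made up for illustration. The second function shows one reason chunk encoding needs care: a chunk boundary must not land in the middle of a multi-byte character.

```python
import codecs

# Known BOM signatures. UTF-32 LE is listed before UTF-16 LE because
# its BOM (FF FE 00 00) starts with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_encoding(sample: bytes):
    """Return (label, has_bom) for a byte sample, or (None, False) if unknown.

    Hypothetical helper: a simplified stand-in for a real charset detector.
    """
    for bom, label in BOMS:
        if sample.startswith(bom):
            return label, True
    try:
        # final=False tolerates a sample that was cut off mid-character,
        # but still raises on byte sequences that are invalid UTF-8.
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
        return "utf-8", False  # plain ASCII also ends up here
    except UnicodeDecodeError:
        return None, False


def split_utf8(data: bytes, chunk_size: int):
    """Split UTF-8 bytes into chunks of at most chunk_size bytes,
    never cutting through a multi-byte character."""
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + chunk_size, len(data))
        # Continuation bytes look like 10xxxxxx. If the next chunk would
        # start on one, back up until the boundary sits before a lead byte.
        while start < end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        if end == start:
            raise ValueError("chunk_size is smaller than one character")
        chunks.append(data[start:end])
        start = end
    return chunks
```

For example, `detect_encoding` reports `("utf-8", False)` for a UTF-8 file without a BOM and `("utf-8-sig", True)` for one with a BOM, and every chunk returned by `split_utf8` can be decoded on its own.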
I don't have a release date for this version yet. I still need to update pages in the new wiki (http://docs.systemwidgets.com). You guys will be able to start creating your own splitting strategies once I get all of this work done. The wiki talks about version 3.0, but you will be able to do this with version 2.5.