Debian, ICU, charlock_holmes – encoding detection for files in Ruby

Recently I had a problem with uploading CSV files and invalid byte sequences. My original CSV files were encoded in UTF-16LE (wtf, right? :)), and I needed UTF-8. So I searched for an easy solution. Maybe this could work:

This wasn’t a good idea, so I kept digging and found a better solution:

Charlock_Holmes gem (https://github.com/brianmario/charlock_holmes)

It uses ICU to detect the file encoding, so at first I refused to use it because it relies on third-party software. I prefer pure Ruby solutions without any third-party dependencies: if you move your site to another server, you have to remember about those other dependencies. Of course you have tests covering everything, right? But in production you don’t run your tests…
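For comparison, here is roughly what a dependency-free approach can look like. It is only a minimal sketch, not what charlock_holmes does: it looks at the byte-order mark (BOM) and transcodes with String#encode, so it cannot guess anything for files without a BOM. The names BOMS, encoding_from_bom and to_utf8 are made up for this example.

```ruby
# Hypothetical helpers: guess a Unicode encoding from the byte-order mark.
# Limitation: a UTF-32LE BOM also starts with FF FE and would be misread
# as UTF-16LE here; files without a BOM are not detected at all.
BOMS = {
  "\xEF\xBB\xBF".b => Encoding::UTF_8,
  "\xFF\xFE".b     => Encoding::UTF_16LE,
  "\xFE\xFF".b     => Encoding::UTF_16BE
}.freeze

# Returns [encoding, bom_length_in_bytes], or nil when no known BOM is found.
def encoding_from_bom(bytes)
  raw = bytes.b
  BOMS.each do |bom, encoding|
    return [encoding, bom.bytesize] if raw.start_with?(bom)
  end
  nil
end

# Strips the BOM (if any) and transcodes to UTF-8.
# Without a BOM the bytes are assumed to be in the fallback encoding.
def to_utf8(bytes, fallback: Encoding::UTF_8)
  encoding, skip = encoding_from_bom(bytes) || [fallback, 0]
  bytes.b.byteslice(skip..-1).force_encoding(encoding).encode(Encoding::UTF_8)
end

# Usage (the file name is just an example):
#   csv_text = to_utf8(File.binread("upload.csv"))
```

Reading the file with File.binread keeps Ruby from interpreting the bytes before the encoding is known. For BOM-less files you need real detection, which is where ICU comes in.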

Anyway, I decided to use Charlock_Holmes. On Debian you can install ICU:

At first I couldn’t find the installed files, so I built it myself
(http://www.linuxfromscratch.org/blfs/view/svn/general/icu.html)

The next part is installing the charlock_holmes gem. It should be super easy… but it wasn’t. I struggled with finding the icu4c directory (on Debian it is… /usr/lib/)

So I installed it

With Ruby 2.0.0 (via rvm) everything went smoothly. Of course, I struggled with the whole thing for a few hours until I figured out how to do it right ;).

When I tested everything:

It worked like a charm. But iconv has been deprecated since Ruby 1.9.3, and Ruby’s built-in encoding support should be used instead. To use it with a newer Ruby, I switched to Ruby 2.2.2 and bundled all the gems and… I forgot to use --with-icu-dir. So after testing Charlock I got:

OK, this should be super easy: just uninstall it and install it again with Ruby 2.2.2

I got a LoadError… wtf? Setting the directory didn’t help. I dug into the gem directories and compared Ruby 2.0.0 with Ruby 2.2.2: the charlock_holmes.so files had different sizes. I copied the file over from ruby-2.0.0 and it worked, so the problem was with the ruby-2.2.2 installation. I looked into

There was

Something went wrong during the installation, so when I changed the directory and installed it like

It worked for me… and

working like it should 😉
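If you switch Ruby versions like I did, a quick smoke test can tell you whether the rebuilt gem really loads before you find out in production. This is only a sketch under assumptions: charlock_loaded? and detect_and_convert are names made up for this example, while EncodingDetector.detect and Converter.convert come from the charlock_holmes README.

```ruby
# Try to load the gem. A LoadError here can mean the gem is missing for the
# current Ruby, or that its native extension was built incorrectly.
def charlock_loaded?
  require 'charlock_holmes'
  true
rescue LoadError => e
  warn "charlock_holmes could not be loaded: #{e.message}"
  false
end

# Hypothetical helper: detect the encoding with ICU and transcode to UTF-8.
# Returns nil if the gem is unavailable or no text encoding was detected.
def detect_and_convert(bytes)
  return nil unless charlock_loaded?

  detection = CharlockHolmes::EncodingDetector.detect(bytes)
  return nil unless detection && detection[:encoding]

  CharlockHolmes::Converter.convert(bytes, detection[:encoding], 'UTF-8')
end

# Usage (the file name is just an example):
#   utf8_csv = detect_and_convert(File.binread("upload.csv"))
```

Returning nil instead of raising is a design choice for this sketch; in an upload handler you may prefer to reject the file with a clear error.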

Thanks for reading and I hope you found what you were looking for.

Sorry for my English – this is my first public post in this language ;). tl;dr 😀

 

Rafath Khan

Rafath Khan - Problem Solver - there is no such thing as a problem, only challenges. Involved with the Internet since 1998: programmer, designer, and creator of websites and web applications, including Shoople.pl, PanShop.pl, UnikalneOpisy.pl. Have a problem? I will help you solve it.
