
Day 9 @ KCS

April 18, 2007

I finally got something to do!!!

I was moved to another project to help convert existing data to the new system. One of the fields is free text: users could type anything into it, in any format. That causes a problem, because the same entity may be represented by several different textual descriptions. Typos make it even more complicated. And the biggest problem is that the data is in Thai, a language that is hard to process because it is written without spaces between words.

At first, I tried to spot patterns by eye, but I gave up after 4 or 5 hours without any progress. I turned to a programmatic method instead. I thought a graph or a tree would help in some way, but how could I extract the nodes? I googled for a while and found out that Java has a built-in word-breaking class. That was enlightening, and very good news.
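For readers who want to try the same thing, here is a minimal sketch of what that word breaking might look like. The post does not name the class, so using `java.text.BreakIterator` is an assumption; the class name `WordBreakSketch`, the helper `words`, and the sample string are made up for illustration. How well Thai text is segmented depends on the locale data shipped with the JDK in use.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class WordBreakSketch {

    // Split text at the word boundaries that BreakIterator reports for the
    // given locale, keeping only segments that contain a letter or digit
    // (this drops pure whitespace and punctuation segments).
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: a Thai locale is requested for the free-text field.
        Locale thai = Locale.forLanguageTag("th-TH");
        // Hypothetical sample value, not taken from the real data.
        String sample = "สวัสดีครับ ยินดีต้อนรับ";
        for (String word : words(sample, thai)) {
            System.out.println(word);
        }
    }
}
```

Each extracted word could then serve as a candidate node for the graph or tree idea, although that part is not worked out here.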

After trying it for a while, I came to this conclusion: even though it’s far from perfect, it’s much better than having nothing.

That’s what I did for the day. It was the first day I didn’t waste most of my time!
