Archive for the ‘NLP’ Category

Holy 45,000 pvs and 3,000 new users this week!

February 6, 2007

Yesterday and today have been INSANE on WipBox!!!  45,000+ page views and 3,000 new users in the last 2 days.

There will be a moderate update happening tomorrow night with some solid bug/usability fixes. Also there will be a tutorial on what the graphs do and how to use them best. I’ve also wrangled some resources to help get BIGGER releases done too, and to make it way more usable and friendly.

Please keep sending in bugs, though. I’ve been putting them into the bug tracker and have all of them slated to be addressed as soon as possible.

Also, please be patient. I originally created this in less than a month, for my own use. For me it works great. But as more people sign up, it’s becoming VERY clear that things need to be made more usable.


1,000+ Downloads of my Python POS NLP Tagger

January 17, 2007

Well, with less than 4 days to go until the 1yr anniversary of the release of my Simple Python Part-of-Speech (POS) NLP Tagger, it’s exceeded 1,000 downloads. That anniversary also marks the day I learned Python: I was getting my tires changed at Costco and had 6 hours to kill, so I printed out the tutorials and took them with me. Got home, figured…let’s port Mark’s C# tagger to Python. That’d be a great way to figure things out. And voila! 1,000 downloads later, it seems to have been the right idea.

Here’s the link for those who wanna play with it.

If anyone needs anything special added to it, let me know. I’m loving LAMP+Python in general and dig excuses to make things better.

Oh yeah, I guess that also means my tires are a year old now too…drat!

This is a sweet use of my NLP Tagger!

December 31, 2006

I came across NodeBox tonight. It’s a sweet 2D visualizer (for the Mac) that’s using my Python POS tagger.

Similar in approach to an early version of one of my lexical energy packages, these guys have fused WordNet with a (my) POS tagger to produce a very cool and seemingly easy to implement/mash up semantic analysis package.

Check out some of the other features within not only their Linguistics library, but some of the other libraries as well.

Excellent Python libraries!

NodeBox application

People liking the Python NLP Part-of-Speech tagger…

February 9, 2006

After being up for just under 3 weeks, the NLP part-of-speech tagger I ported to Python has hit its 100th download. Not bad for my first Python attempt. Thanks everyone (or at least 100 of ya :)).

Simple NLP Part-of-Speech tagger in Python

January 20, 2006

So yesterday, I decided to learn Python. Been a .NET guy primarily for the last n years, had some people work in Python around me, but never was inclined to try it out. DUH!!!! Such a nice language. It took a couple minutes to get my bearings, but I figured…why not! Everyone in the Valley is so anti-MS and so pro-(Python, MySQL, PHP) that one needs to embrace the flow.

For the last couple years I’ve been using a very simple, yet (what I believe to be) strong, POS tagger built by Mark Watson and based on Eric Brill’s work. Written in C#, it gave me a very straightforward paring knife to do tokenization and POS tagging quickly and easily in .NET. Now Monty Tagger and NLTK are definitely incredible resources for NLP in Python, but I wanted something very straightforward and portable without all the bells and whistles so I can build on the core myself. Not to mention I wanted something fun for my first outing in Python. Well…ta da! Here it is.

It’s comprised of two (count them 2) VERY simple source files. The first is the basic hashing and pickling utility if you want to make changes to the lexicon (I believe I’m using the same lexicon file as Monty Tagger), and the second is the actual tagger/tokenizer.
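
For readers who want a feel for that two-part layout before downloading, here is a rough sketch. It is not the actual source: the lexicon format (a word followed by its tags, most likely tag first), the function names, and the three patch rules are all assumptions made for illustration, written in the general spirit of a Brill-style tagger (look each word up in a lexicon, then patch the result with a few rules).

```python
import io
import pickle
import re

# Hypothetical lexicon text: one word per line, followed by its possible
# tags with the most likely tag first. The real lexicon file may differ.
SAMPLE_LEXICON = """\
the DT
dog NN
can MD NN
run VB NN
. .
"""


def build_lexicon(lines):
    """Part 1a: map each word to its most likely tag (the first one listed)."""
    lexicon = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            lexicon[parts[0]] = parts[1]
    return lexicon


def save_lexicon(lexicon, fileobj):
    """Part 1b: pickle the lexicon so the tagger can load it quickly."""
    pickle.dump(lexicon, fileobj, protocol=pickle.HIGHEST_PROTOCOL)


def load_lexicon(fileobj):
    """Load a lexicon previously written by save_lexicon."""
    return pickle.load(fileobj)


def tokenize(text):
    """Part 2a: split text into word tokens and single punctuation marks."""
    return re.findall(r"[A-Za-z0-9']+|[^\sA-Za-z0-9']", text)


def tag(tokens, lexicon):
    """Part 2b: lexicon lookup followed by a few illustrative patch rules."""
    # Pass 1: look each token up (exact, then lowercased); default to noun.
    tags = [lexicon.get(t) or lexicon.get(t.lower()) or "NN" for t in tokens]

    # Pass 2: patch the initial guesses (assumed rules, not the real set).
    for i, token in enumerate(tokens):
        known = token in lexicon or token.lower() in lexicon
        # A verb right after a determiner is more likely a noun.
        if i > 0 and tags[i - 1] == "DT" and tags[i].startswith("VB"):
            tags[i] = "NN"
        # Unknown words ending in "ly" are probably adverbs.
        if not known and token.endswith("ly"):
            tags[i] = "RB"
        # Unknown words ending in "ing" are probably gerunds/participles.
        if not known and token.endswith("ing"):
            tags[i] = "VBG"
    return list(zip(tokens, tags))


if __name__ == "__main__":
    lexicon = build_lexicon(SAMPLE_LEXICON.splitlines())

    # Round-trip through pickle using an in-memory buffer instead of a file.
    buffer = io.BytesIO()
    save_lexicon(lexicon, buffer)
    buffer.seek(0)
    lexicon = load_lexicon(buffer)

    print(tag(tokenize("The dog can run quickly."), lexicon))
```

In real use the pickle would go to a file opened in binary mode rather than an in-memory buffer, and the lexicon would hold many thousands of entries. Keeping the lexicon builder apart from the tagger means the text lexicon only has to be parsed once, whenever it changes.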

I’ve made some additional tweaks to the versions I run and plan to port some of them to Python as well. If you’re interested in additions, add a comment and I’ll do my best to share/accommodate.

You can download my Python NLP Part-of-Speech Tagger here.

This is my first anything outside of some Hello World stuff in Python. It definitely works, and does so at a decent clip (speed wise), but I’m sure I could have done some of the operations a little more elegantly. Leave comments though with recommendations/suggestions/!flames.