Weekly tasks: initial approaches to consider:
The following describes the preliminary work I need to get done over the next week.
- Obtain data. For a fair comparison with the literature, this should be the TDT3 or TDT4 corpus. I also need data from Vision Bytes in the form of tagged (or partially tagged) software. This should most likely be pulled from the DB rather than extracted manually (so it has to be done on site).
- A preliminary clustering algorithm that clusters in real time. This first requires some feature extraction, that is, extracting the features of the data and placing them into a vector space or language model.
- A system that performs first story detection and tracks (and ranks) the stories.
- Write up the background, literature review, requirements and initial work for the draft.
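
As a starting point for the clustering and first story detection items, here is a rough sketch of one common approach: single-pass (incremental) clustering over bag-of-words vectors, where a document that is not similar enough to any existing cluster is flagged as a first story. Everything in it is an assumption for illustration and not a decided design: the names (OnlineClusterer, to_vector, ranked), the raw term-frequency features, the cosine similarity measure, the 0.3 threshold, and ranking stories by cluster size.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def to_vector(text):
    """Map a document to a sparse term-frequency vector (assumed feature set)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class OnlineClusterer:
    """Single-pass clustering: each document is seen once, in arrival order."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed novelty threshold, needs tuning
        self.centroids = []         # summed term counts per cluster
        self.sizes = []             # number of documents per cluster

    def add(self, text):
        """Assign a document to a cluster; return (cluster_id, is_first_story)."""
        vec = to_vector(text)
        best_id, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_id, best_sim = i, sim
        if best_id is None or best_sim < self.threshold:
            # Nothing similar enough has been seen: treat as a first story.
            self.centroids.append(Counter(vec))
            self.sizes.append(1)
            return len(self.centroids) - 1, True
        # Otherwise fold the document into the closest existing story.
        self.centroids[best_id].update(vec)
        self.sizes[best_id] += 1
        return best_id, False

    def ranked(self):
        """Cluster ids ordered by size, largest first (one possible ranking)."""
        return sorted(range(len(self.sizes)),
                      key=lambda i: self.sizes[i], reverse=True)
```

Feeding documents to add() one at a time keeps the work per document proportional to the number of clusters, which is what makes it usable in real time. TF-IDF weighting or a language model could replace to_vector later without changing the rest of the loop.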