We have decided on a strategy for detecting plagiarism automatically. It will be a hybrid of techniques from neighbouring research areas.
Our aim is to catch plagiarism both semantically and stylistically. We have a nice word space model that will be used to capture semantic features of the text, and for style recognition we will use techniques from the authorship identification research field.
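To make the two kinds of signal concrete, here is a minimal sketch in Python. It is not our actual model: a plain bag-of-words cosine stands in for the word space model, and relative frequencies of a handful of function words stand in for the authorship identification features. The `FUNCTION_WORDS` list and all function names are assumptions made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny function-word list; real stylometry uses far more features.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "is", "it")


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def semantic_similarity(x, y):
    """Stand-in for the word space model: cosine over raw word counts."""
    return cosine(Counter(tokens(x)), Counter(tokens(y)))


def style_profile(text):
    """Relative frequency of each function word in the text."""
    words = tokens(text)
    counts = Counter(words)
    total = len(words) or 1
    return {w: counts[w] / total for w in FUNCTION_WORDS}


def style_similarity(x, y):
    """Cosine similarity between two function-word profiles."""
    return cosine(style_profile(x), style_profile(y))
```

A passage that scores high on `semantic_similarity` against a source but low on `style_similarity` against the rest of its own document would be the kind of case a hybrid approach is meant to flag. How the two scores should be combined is left open here.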
I will implement two baseline algorithms against which to measure our results. The first will be a really naïve one and will act as a lower bound that we should never get close to. The second will represent the "state of the art" plagiarism detection tool that we will strive to surpass; it will probably be the winner of the 1st International Competition on Plagiarism Detection, namely ENCOPLOT.
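As an illustration of what a really naïve baseline could look like, here is a rough sketch in Python. The exact method is still undecided, so this is only an assumption: it flags a source document when the Jaccard overlap of word trigrams with the suspicious document reaches a threshold. The function names, `n=3`, and `threshold=0.2` are all hypothetical choices, and this is not how ENCOPLOT works.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in a lowercased, whitespace-split text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_overlap(suspicious, source, n=3):
    """Jaccard similarity between the n-gram sets of two texts."""
    a, b = word_ngrams(suspicious, n), word_ngrams(source, n)
    if not a or not b:
        return 0.0  # too short to form any n-gram
    return len(a & b) / len(a | b)


def naive_baseline(suspicious, sources, threshold=0.2, n=3):
    """Return indices of sources whose overlap reaches the (assumed) threshold."""
    return [
        i for i, src in enumerate(sources)
        if jaccard_overlap(suspicious, src, n) >= threshold
    ]
```

Exact n-gram matching like this misses any paraphrased or obfuscated copying, which is why it makes sense only as a lower bound.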
Thursday, April 1, 2010