Academics can employ various strategies to police plagiarism, ranging from simple Web search techniques used by individual lecturers, through easy-to-use freeware capable of tracking plagiarism between cohorts of students, to quite elaborate systemic approaches involving the engagement of commercial plagiarism detection agencies.
While it is possible to detect plagiarism in assignments submitted in hard copy, electronic submission makes the whole process considerably easier, because academics are in a position to 'fight fire with fire'. Armed with a good search engine, a lazy or accidental plagiarist can be exposed relatively easily. The first thing to do is to identify some suspicious-looking text: a paragraph with a discernible change in writing style, for example. Then, by copying a word string of, say, five to six words from this paragraph, enclosing it in a set of inverted commas, and pasting it into a search engine (for example, http://www.google.com/, http://www.dogpile.com/ or http://www.mywebsearch.com/), a list of potential source documents for the word string will be generated in the space of a few seconds. By clicking on one of the websites listed, it may be possible to locate the passage one is looking for with the naked eye. In the event that this is not possible, use the text copied from the student's essay, which is still stored on the computer clipboard, and paste it into the 'Find on This Page' option in the Edit menu of the Web browser: the precise location of the offending text on the website in question will be highlighted.
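The phrase-search step described above can be sketched in a few lines of Python. This is only an illustration: the function name `phrase_query_url` is hypothetical, and the `search?q=` query format appended to the Google address is an assumption that may differ for other search engines.

```python
from urllib.parse import quote_plus


def phrase_query_url(passage, start=0, length=6,
                     base="https://www.google.com/search?q="):
    """Build a search URL for an exact-phrase query.

    `base` is an assumed query format; adjust it for the search engine in use.
    """
    words = passage.split()
    # Take a short word string (five to six words) from the suspicious passage.
    phrase = " ".join(words[start:start + length])
    # Inverted commas around the phrase ask the engine for an exact match.
    return base + quote_plus('"' + phrase + '"')


url = phrase_query_url("the quick brown fox jumps over the lazy dog", length=5)
print(url)
```

Pasting the resulting address into a browser is equivalent to typing the quoted word string into the search box by hand.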
Suspicious-looking text need not be restricted to an inexplicable change in a student's writing style: the usual tell-tale sign in the case of the lazy or accidental plagiarist. It is possible to trip up the cunning plagiarist by running checks on the consistency of formatting in their essay. US English intermingled with UK English, font and text size changes, and carriage returns (↵) or paragraph symbols (¶) at the end of each line are usually quite reliable indicators of cut-and-paste activity. More subtle examples include the style of inverted commas, and the use of special characters like 'en dashes'. The inclusion of 'straight quotes' (" ") and consecutive hyphens (--) where you would normally expect to find ‘curly quotes’ (“ ”) and en dashes (–) may also be a sign that there have been extensive cut-and-pastes from HTML documents.
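Some of these formatting checks lend themselves to automation. The following is a minimal sketch, assuming the essay is available as plain text; the function name `formatting_flags` and the flag wording are hypothetical, and the check covers only quote and dash styles, not spelling or font changes. Mixed styles are a prompt for a closer look, not proof of copying.

```python
import re

STRAIGHT_QUOTES = re.compile(r"[\"']")
CURLY_QUOTES = re.compile("[\u2018\u2019\u201c\u201d]")
DOUBLE_HYPHEN = re.compile(r"--")
EN_DASH = re.compile("\u2013")


def formatting_flags(text):
    """Count quote and dash styles and flag any mixture of styles."""
    counts = {
        "straight_quotes": len(STRAIGHT_QUOTES.findall(text)),
        "curly_quotes": len(CURLY_QUOTES.findall(text)),
        "double_hyphens": len(DOUBLE_HYPHEN.findall(text)),
        "en_dashes": len(EN_DASH.findall(text)),
    }
    flags = []
    # Both styles present in one essay suggests text from more than one source.
    if counts["straight_quotes"] and counts["curly_quotes"]:
        flags.append("mixed quote styles")
    if counts["double_hyphens"] and counts["en_dashes"]:
        flags.append("mixed dash styles")
    return counts, flags


sample = 'He said "yes" -- then \u201cno\u201d, pages 3\u20135.'
counts, flags = formatting_flags(sample)
print(counts, flags)
```

Note that straight apostrophes inside words are counted as straight quotes here, so a short essay typed in a plain-text editor may be flagged for innocent reasons.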
Sometimes, extensive Web searches for evidence of plagiarism yield a nil return when all the indications suggest otherwise. In this instance, the process utilised for searching the Web needs to be repeated but, this time, in conjunction with online databases (for example, ProQuest and EBSCO Host). Because these databases require a university library login, their documents may not show up on the Web, and they are likely, therefore, to escape the clutches of a Google search. If this strategy fails to produce results, and the evidence suggesting plagiarism is compelling, then the text used may not be digitised (which obviously makes detection more difficult), or it could be that the student has been able to access the work of a colleague. In the latter case, a program that compares the work of one student against that of another, both within and across cohorts, is of considerable assistance. One such program, available as freeware, has been devised by Lou Bloomfield at the University of Virginia (see section 4.1).
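To illustrate the general idea of comparing one student's work against another's, the sketch below scores each pair of essays by the proportion of six-word strings they share. It is a generic illustration and makes no claim to reproduce the method used by Bloomfield's program; the function names and the choice of six-word strings are assumptions.

```python
import re
from itertools import combinations


def word_ngrams(text, n=6):
    """Return the set of n-word strings in `text`, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_fraction(a, b, n=6):
    """Fraction of the shorter essay's n-word strings that also occur in the other."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / min(len(grams_a), len(grams_b))


def rank_pairs(essays, n=6):
    """Score every pair of essays (a dict of name -> text), highest overlap first."""
    scores = [
        (shared_fraction(essays[x], essays[y], n), x, y)
        for x, y in combinations(sorted(essays), 2)
    ]
    return sorted(scores, reverse=True)


essays = {
    "a": "the cat sat on the mat and looked at the dog",
    "b": "yesterday the cat sat on the mat and looked away",
    "c": "completely different words appear in this short essay text here",
}
for score, x, y in rank_pairs(essays, n=4):
    print(f"{x} vs {y}: {score:.2f}")
```

Pairs at the top of the ranking are candidates for manual inspection. Essays from earlier cohorts can be added to the same dictionary to compare across years.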
If academics find the policing of plagiarism too labour-intensive, there is always the option of out-sourcing the task to private enterprise. Indeed, almost as quickly as student cheat sites arrived on the scene, electronic student cheat detection services emerged to counter them. One company, Turnitin.com, has become very successful in this and is the leading supplier of plagiarism detection services in the world today, claiming to have a client base of more than 3,500 educational institutions in over 50 countries, including the UK where it services more than 700 institutions of further and higher education through JISC. For its source documents, Turnitin.com uses the current and an extensively archived copy of the publicly accessible internet, which includes more than 4 billion pages, updated daily at a rate of around 40 million pages per day. In addition, it uses millions of published works, including tens of thousands of electronic books, and journals accessible via ABI/Inform, Periodical Abstracts and Business Dateline. It also uses all the student papers that have ever been submitted to Turnitin.com.
Enlisting private companies as agents of the university to act in the capacity described above is clearly a risky enterprise, and it is one that requires very careful consideration (especially as, in the early days, it was discovered by Lou Bloomfield that some plagiarism detection sites shared domain registrations and servers with several paper mills!). The British have been far more cautious than their North American counterparts in this respect: the detailed feasibility studies under the aegis of JISC provide evidence of this. In other parts of the world, such as Australasia, there has been even greater caution, perhaps reflecting the relative immaturity of the plagiarism detection industry and the general conservatism of educational institutions, although possibly also the cultural differences referred to earlier. This may change as plagiarism detection services become more established and develop a reputation, but then there is always the question of cost. In the UK, for example, JISC has committed to fully funding its online plagiarism detection service until August 2005, at which time a sliding scale of charges will be introduced.