| Karen Spärck Jones | |
|---|---|
| Spärck Jones in 2002 | |
| Born | Karen Ida Boalth Spärck Jones, 26 August 1935, Huddersfield, Yorkshire, England |
| Died | 4 April 2007 (aged 71),[1] Willingham, Cambridgeshire, England |
| Alma mater | University of Cambridge |
| Known for | Term frequency–inverse document frequency |
| Spouse | Roger Needham (m. 1958) |
| Institutions | University of Cambridge |
| Thesis | Synonymy and Semantic Classification (1964) |
| Doctoral advisor | Richard Braithwaite[1] |
| Website | cl |
Karen Ida Boalth Spärck Jones FBA (26 August 1935 – 4 April 2007) was a self-taught programmer and a pioneering British computer and information scientist responsible for the concept of inverse document frequency (IDF), a term-weighting technique that underlies most modern search engines.[2][3][4][5][6][7] She was an advocate for women in computer science, her slogan being "Computing is too important to be left to men."[8] In 2019, The New York Times published her belated obituary in its series Overlooked,[9][10] calling her "a pioneer of computer science for work combining statistics and linguistics, and an advocate for women in the field."[10] Since 2008, to recognise her achievements in information retrieval (IR)[11][12] and natural language processing (NLP), the Karen Spärck Jones Award has been given annually for outstanding research in one or both of her fields.[13][14][15][16]
Early life and education
Karen Ida Boalth Spärck Jones was born in Huddersfield, Yorkshire, England. Her parents were Alfred Owen Jones, a chemistry lecturer, and Ida Spärck, a Norwegian who worked for the Norwegian government while in exile in London during World War II.[17]
Spärck Jones was educated at a grammar school in Huddersfield and then, from 1953 to 1956, at Girton College, Cambridge, where she studied history, with an additional final year in Moral Sciences (philosophy). While at Cambridge, she joined the Cambridge Language Research Unit (CLRU) and met its head, Margaret Masterman, who would inspire her to go into computer science.[10] While working at the CLRU, Spärck Jones began pursuing her PhD. At the time of submission, her PhD thesis was cast aside as uninspired and lacking original thought, but it was later published in its entirety as a book.[18] She briefly became a schoolteacher[19] before moving into computer science.[20] Spärck Jones married fellow Cambridge computer scientist Roger Needham in 1958.[21][10]
Spärck Jones's mother, Ida Spärck, had fled Norway on one of the last boats out after the German invasion in April 1940, going on to serve the Norwegian government in exile in London throughout the war.[21] This background of displacement and resilience shaped the household in which Spärck Jones grew up. She later kept her mother's Norwegian surname professionally after marrying, stating that "it maintains a permanent existence of your own."[10]
Spärck Jones described her entry into computing as almost accidental. She had been working as a schoolteacher when she began visiting the CLRU out of curiosity about her husband's work. It was Margaret Masterman — whom she later described as "a very strange and interesting woman" — who offered her a research position and drew her fully into the field.[10]
Career
Spärck Jones worked at the Cambridge Language Research Unit from the late 1950s,[21] then at Cambridge University Computer Laboratory from 1974 until her retirement in 2002.
From 1999, she held the post of Professor of Computers and Information.[1] She had been given a permanent position only in 1993, and earlier in her career had been employed on a series of short-term contracts.[10] She continued to work in the Computer Laboratory until shortly before her death. Her publications include nine books and numerous papers. A full list of her publications is available from the Cambridge Computer Laboratory.[22]
Spärck Jones's main research interests, from the late 1950s onwards, were natural language processing and information retrieval. In 1964, she completed her thesis, "Synonymy and Semantic Classification", which is now seen as a foundational work in the field of natural language processing. One of her most important contributions was the concept of inverse document frequency (IDF) weighting in information retrieval, which she introduced in a 1972 paper.[11][23] IDF is used in most search engines today, usually as part of the term frequency–inverse document frequency (TF–IDF) weighting scheme.[24] In the 1980s, Spärck Jones began work on early speech recognition systems. In 1982 she became involved in the Alvey Programme,[10] an initiative to stimulate computer science research across the country.
Significance of inverse document frequency
At the time Spärck Jones was working, most computer scientists were focused on making people adapt to machines — learning precise codes and commands to retrieve information. Spärck Jones was working in the opposite direction: teaching computers to understand human language as it is actually used.[10]
Her 1972 paper introduced the concept of inverse document frequency (IDF) by observing that not all words carry equal informational value. A word like "the" appears in virtually every document and tells a retrieval system almost nothing about what any specific document is about. A rare word like "photosynthesis," by contrast, is highly specific and informative. IDF assigns each word a statistical weight based on how rarely it occurs across a document collection — the rarer the word, the higher its weight.[11] When combined with term frequency (TF), which measures how often a word appears within a single document, the resulting TF–IDF score gives every word a relevance rating that can be used to rank documents in response to a search query.[25]
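The weighting described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the common textbook form idf(t) = log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing the term t, which is not necessarily the exact formulation used in the 1972 paper. The function names (idf, tf_idf, rank) and the three-document toy corpus are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def idf(term, docs):
    """Inverse document frequency, assuming the textbook form log(N / df)."""
    df = sum(1 for doc in docs if term in doc)  # documents containing the term
    if df == 0:
        return 0.0  # a term absent from the collection contributes nothing
    return math.log(len(docs) / df)


def tf_idf(term, doc, docs):
    """Term frequency (relative count within one document) times IDF."""
    if not doc:
        return 0.0
    tf = Counter(doc)[term] / len(doc)
    return tf * idf(term, docs)


def rank(query, docs):
    """Return document indices ordered by summed TF-IDF score, best first."""
    scores = [sum(tf_idf(t, doc, docs) for t in query) for doc in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy corpus (illustrative only): each document is a list of lowercase tokens.
docs = [
    "the plant uses photosynthesis to make sugar".split(),
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
]

# "the" occurs in every document, so its weight is log(3/3) = 0;
# "photosynthesis" occurs in only one, so it receives the highest weight.
best = rank(["the", "photosynthesis"], docs)[0]
```

In this sketch the word "the" contributes nothing to any score, so the query is decided entirely by the rare term, and the first document ranks highest. Production search systems typically use smoothed or otherwise refined variants of this weighting.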
By 2007, Spärck Jones noted that "pretty much every web engine uses those principles."[10] Her colleague John Tait remarked that "a lot of the stuff she was working on until five or ten years ago seemed like mad nonsense, and now we take it for granted."[10] The 1972 paper remains among the most cited works in information retrieval research, with over 4,500 citations recorded in Google Scholar at the time of her death.[26]
The conceptual foundation of TF–IDF — that word meaning is statistical and contextual — has also informed later developments in machine learning and natural language processing, including transformer-based language models such as BERT.[27]
Impact on artificial intelligence
Although Spärck Jones was rather pessimistic about what artificial intelligence (AI) could achieve in information retrieval,[28] her work in natural language processing and information retrieval, including the concept of inverse document frequency (IDF), contributed to the later development of AI. Her statistical ranking methods helped shift AI development towards more scalable, data-driven approaches.
Her impact on AI was more indirect and conceptual than her direct and continuing impact on search engines.
Gender and advocacy
Spärck Jones spent the majority of her career at Cambridge on short-term contracts without permanent employment, a situation she attributed directly to gender. In her 2001 IEEE oral history interview she stated that Cambridge was "in many ways not user-friendly, in the sense of women-friendly."[19] She was frequently the only woman present in professional meetings throughout her career.[29]
She channelled this experience into active advocacy. She was a founding member of the women@cl network at Cambridge's Computer Laboratory, worked on outreach programmes aimed at encouraging girls into computing, and became widely known for her slogan: "Computing is too important to be left to men."[30] She was the first woman ever to receive the BCS Lovelace Medal.[31]
Honours and awards
Spärck Jones's honours and awards include:
- Gerard Salton Award (1988)[32]
- Elected a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) in 1993[33][34]
- President of the Association for Computational Linguistics (ACL) in 1994[34]
- Honorary degree of Doctor of Science from The City University in 1997[6]
- Elected a Fellow of the British Academy (FBA), where she also served as Vice-President in 2000–2002[35]
- Fellow of the European Association for Artificial Intelligence (ECCAI)[34]
- Association for Information Science and Technology (ASIS&T) Award of Merit (2002)[16]
- Association for Computational Linguistics (ACL) Lifetime Achievement Award (2004)[36]
- ACM–AAAI Allen Newell Award (2006)[35]
- BCS Lovelace Medal (2007)[35]
- Association for Computing Machinery (ACM) Women's Group Athena Award (2007)[16]
Death and legacy
Spärck Jones died of cancer on 4 April 2007, at the age of 71.[10]
In 2008, the BCS Information Retrieval Specialist Group (BCS IRSG), in conjunction with the British Computer Society, established an annual Karen Spärck Jones Award in her honour to encourage and promote research that advances the understanding of natural language processing or information retrieval.[6] The Karen Spärck Jones lecture, sponsored by the BCS, recognises the contribution that women have made to computing.[37]
In August 2017, the University of Huddersfield renamed one of its campus buildings in her honour. Formerly known as Canalside West, the Spärck Jones building houses the University's School of Computing and Engineering.[38]
When Spärck Jones died in 2007, The New York Times did not publish an obituary for her, despite having published one for her husband Roger Needham in 2003.[10] In 2019, the newspaper included her in its Overlooked series under the title "Overlooked No More: Karen Sparck Jones, Who Established the Basis for Search Engines."[10]
In 2024, the University of Cambridge and the UK Government's Department for Science, Innovation and Technology jointly launched the Spärck AI Scholarships to support the next generation of AI researchers, named in her honour.[39]
References
- "Jones, Karen Ida Boalth Spärck (1935–2007), Computer Scientist". Oxford Dictionary of National Biography (online ed.). Oxford University Press. doi:10.1093/ref:odnb/98729. (Subscription, Wikipedia Library access or UK public library membership required.)
- Video: Natural Language and the Information Layer, Karen Spärck Jones, March 2007
- University of Cambridge obituary
- Obituary, The Independent, 12 April 2007
- Robertson, S.; Tait, J. (2008). "Karen Spärck Jones". Journal of the American Society for Information Science and Technology. 59 (5): 852. doi:10.1002/asi.20784.
- "Karen Spärck Jones Award | BCS". www.bcs.org. Retrieved 21 January 2023.
- Sparck Jones, Karen (31 January 2005). "2002 ASIST Award of Merit: Karen Sparck Jones". Bulletin of the American Society for Information Science and Technology. 29 (3): 12–14. doi:10.1002/bult.274. Retrieved 12 March 2025.
- "Computing's too important to be left to men | BCS".
- Padnani, Amisha; Bennett, Jessica (8 March 2018). "Remarkable People We Overlooked in Our Obituaries". The New York Times. ISSN 0362-4331. Retrieved 7 December 2019.
- "Overlooked No More: Karen Sparck Jones, Who Established the Basis for Search Engines". The New York Times. 2 January 2019. ISSN 0362-4331. Retrieved 7 December 2019.
- Spärck Jones, K. (1972). "A Statistical Interpretation of Term Specificity and Its Application in Retrieval". Journal of Documentation. 28: 11–21. CiteSeerX 10.1.1.115.8343. doi:10.1108/eb026526. S2CID 2996187.
- Tait, John I., ed. (2005). Charting a New Course: Natural Language Processing and Information Retrieval, Essays in Honour of Karen Spärck Jones. The Kluwer International Series on Information Retrieval. Vol. 16. doi:10.1007/1-4020-3467-9. ISBN 978-1-4020-3343-8.
- Obituary, The Times, 22 June 2007 (subscription required)
- Computer Science, A Woman's Work, IEEE Spectrum, May 2007
- Thompson, Bill. "Karen Spärck Jones". A Stick a Dog and a Box With Something in It. Retrieved 1 August 2019. (originally published in The Times)
- Tait, J. I. (2007). "Karen Spärck Jones". Computational Linguistics. 33 (3): 289–291. doi:10.1162/coli.2007.33.3.289. S2CID 19790552.
- Pulman, S. G. (2011). "Karen Ida Boalth Spärck Jones 1935–2007" (PDF). Proceedings of the British Academy. 166 (IX).
- Robertson, S., & Tait, J. (2008). Karen Spärck Jones. Journal of the American Society for Information Science & Technology, 59(5), 852–854.
- Spärck Jones, Karen (10 April 2001). "Karen Spärck Jones, an oral history conducted in 2001". IEEE History Center (Interview). Interviewed by Janet Abbate. Piscataway, NJ.
- Karen Spärck Jones (1986). Synonymy and Semantic Classification (thesis published as a book). Edinburgh Information Technology series. Vol. 1. Edinburgh University Press. ISBN 9780852245170.
- Anon (2007). "Karen Spärck Jones, FBA Professor Emerita of Computers and Information Honorary Fellow of Wolfson College 26 August 1935 – 4 April 2007". University of Cambridge.
- "Karen Sparck Jones Publications".
- Spärck Jones, K. (1973). "Index term weighting". Information Storage and Retrieval. 9 (11): 619–633. doi:10.1016/0020-0271(73)90043-0.
- Maybury, M. T. (2005). "Karen Spärck Jones and Summarization". Charting a New Course: Natural Language Processing and Information Retrieval. The Kluwer International Series on Information Retrieval. Vol. 16. pp. 99–10. doi:10.1007/1-4020-3467-9_7. ISBN 978-1-4020-3343-8.
- "Women Who Changed Tech – Karen Spärck Jones". Extreme Networks. 25 May 2023. Retrieved 7 April 2026.
- Willett, P.; Robertson, S. (2007). "In memoriam: Karen Spärck Jones". Journal of Documentation. 63 (5). doi:10.1108/jd.2007.27863eaa.001.
- "Karen Spärck Jones: The Search Engineer Enabler". History of Data Science. 3 December 2021. Retrieved 7 April 2026.
- Jones, Karen Sparck (September 1991). "The role of artificial intelligence in information retrieval". Journal of the American Society for Information Science. 42 (8): 558–565. doi:10.1002/(SICI)1097-4571(199109)42:8<558::AID-ASI4>3.0.CO;2-4. ISSN 0002-8231.
- "Karen Spärck Jones: A pioneering Girtonian and interdisciplinary role model". Girton College, Cambridge. 9 June 2025. Retrieved 7 April 2026.
- "Computing's too important to be left to men". British Computer Society. Retrieved 7 April 2026.
- "Karen Spärck Jones: A pioneering Girtonian and interdisciplinary role model". Girton College, Cambridge. 9 June 2025. Retrieved 7 April 2026.
- "Gerard Salton Awards". Special Interest Group on Information Retrieval. Retrieved 2 April 2018.
- Anon (2022). "Elected AAAI Fellows". aaai.org.
- "Karen Spärck Jones". The Computer Laboratory, Cambridge University. March 2007. Retrieved 2 April 2018.
- "Karen Spärck Jones". The Daily Telegraph. 12 April 2007.
- "ACL Lifetime Achievement Award Recipients". ACL wiki. ACL. Retrieved 16 August 2014.
- "Karen Spärck Jones lecture". BCS Academy of Computing. British Computer Society. Retrieved 3 October 2013.
- "How to find us – University of Huddersfield". hud.ac.uk. Retrieved 20 September 2017.
- "Karen Spärck Jones: A pioneering Girtonian and interdisciplinary role model". Girton College, Cambridge. 9 June 2025. Retrieved 7 April 2026.