I’ve always had a special fondness for my name, which — according to Ryan Gosling in “Lars and the Real Girl” — is a scientific fact for most people (Ryan Gosling constitutes scientific proof in my book). Plus, the root word for Hilary is the Latin word “hilarius” meaning cheerful and merry, which is the same root word for “hilarious” and “exhilarating.” It’s a great name.
Several years ago I came across this blog post, which provides a cursory analysis for why “Hillary” is the most poisoned name of all time. The author is careful not to comment on the details of why “Hillary” may have been poisoned right around 1992, but I’ll go ahead and make the bold causal conclusion that it’s because that was the year that Bill Clinton was elected, and thus the year Hillary Clinton entered the public sphere and was generally reviled for not wanting to bake cookies or something like that. Note that this all happened when I was 7 years old, so I spent the formative years of 7-15 being called “Hillary Clinton” whenever I introduced myself. Luckily, I was a feisty feminist from a young age and rejoiced in the comparison (and life is not about being popular).
In the original post the author bemoans the lack of research assistants to perform his data extraction for a more complete analysis. Fortunately, in this era we have replaced human jobs with computers, and the data can be easily extracted using programming. This weekend I took the opportunity to learn how to scrape the social security data myself and do a more complete analysis of all of the names on record.
Is Hilary/Hillary really the most rapidly poisoned name in recorded American history? An analysis.
I will follow up this post with more details on how to perform web-scraping with R (for this I am infinitely indebted to my friend Mark — check out his storyboard project and be amazed!). For now, suffice it to say that I was able to collect from the social security website the data for every year between 1880 and 2011 for the 1000 most popular baby names. For each of the 1000 names in a given year, I collected the raw number of babies given that name, as well as the percentage of babies given that name, and the rank of that name. For girls, this resulted in 4110 total names.
In the original analysis, the author looked at the changed rank of “Hillary.” The ranks are interesting, but the SSA provides finer-grained data than that. The raw numbers of babies given a certain name are likewise interesting, but do not normalize for the population. Thus the percentage of babies given a certain name is the best measurement.
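To make the normalization point concrete, here is a minimal sketch in Python. It is illustrative only: the counts are invented, `total_girls` and `percent_named` are hypothetical names, and this is not the code from the GitHub repository.

```python
# Hypothetical counts, invented purely for illustration (not SSA data).
births = {
    1950: {"total_girls": 1_000_000, "Jane": 10_000},
    2000: {"total_girls": 2_000_000, "Jane": 12_000},
}


def percent_named(year, name):
    """Share of that year's babies given `name`, expressed as a percentage."""
    row = births[year]
    return 100.0 * row[name] / row["total_girls"]


for year in sorted(births):
    # Print the raw count next to the normalized percentage.
    print(year, births[year]["Jane"], round(percent_named(year, "Jane"), 2))
```

In this made-up example the raw count of Janes goes up while the share of babies named Jane goes down, which is exactly why the raw numbers alone can mislead.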
Looking at the absolute change in percentages is interesting, but would not tell the full story. A change from, say, 15% to 14% would be quite different and less drastic than a change from 2% to 1%, but the absolute change in percentage would measure those two things equally. Thus, I need a measure of the relative change in the percentages, that is, the percent change in percentages (confusing, I know). Fortunately the public health field has dealt with this problem for a long time, and has a measurement called the relative risk, where “risk” here refers to the proportion of babies given a certain name. For example, let’s say the percentage of babies named “Jane” is 1% of the population in 1990, and 1.2% of the population in 1991. The relative risk of being named “Jane” in 1991 versus 1990 is 1.2 (that is, it’s (1.2/1)=1.2 times as probable, or (1.2-1)*100=20% more likely). In this case, however, I’m interested in instances where the percentage of children with a certain name decreases. The most sensible statistic in this case is still the relative risk, but interpreted as a decrease. That is, if “Jane” was at 1.5% in 1990 and 1.3% in 1991, then the relative risk of being named “Jane” in 1991 compared to 1990 is (1.3/1.5)=0.87. That is, it is (1-0.87)*100=13% less likely that a baby will be named “Jane” in 1991 compared to 1990.
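The two “Jane” calculations above can be written as a small Python sketch. The function names `relative_risk` and `percent_change` are my own labels for illustration, not anything from the project code.

```python
def relative_risk(pct_before, pct_after):
    """Ratio of the later year's percentage to the earlier year's percentage."""
    if pct_before <= 0:
        raise ValueError("pct_before must be positive")
    return pct_after / pct_before


def percent_change(pct_before, pct_after):
    """Relative change in percent; a negative value means a drop."""
    return (relative_risk(pct_before, pct_after) - 1.0) * 100.0


# "Jane" rising from 1% to 1.2%, then the second example falling from 1.5% to 1.3%.
print(round(relative_risk(1.0, 1.2), 2), round(percent_change(1.0, 1.2), 1))
print(round(relative_risk(1.5, 1.3), 2), round(percent_change(1.5, 1.3), 1))
```

Note that a drop from 15% to 14% and a drop from 2% to 1% have the same absolute change (one percentage point) but very different relative risks (about 0.93 versus 0.5).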
(Note that I’m not doing any model fitting here because I’m not interested in any parameter estimates — I have my entire population! I’m just summarizing the data in a way that makes sense.)
So, for each of the 4110 names that I collected, I calculated the relative risk going from one year to the next, all the way from 1880 to 2011. I then pulled out the names with the biggest percent drops from one year to the next.
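As a rough sketch of that step in Python: the data layout (a dictionary of name to yearly percentages), the toy values, and the function name `yearly_drops` are all assumptions made for illustration, and this is not the code from the repository.

```python
# Toy data: name -> {year: percent of babies}. Values are invented.
pct = {
    "Alpha": {1990: 0.20, 1991: 0.19, 1992: 0.06},
    "Beta": {1990: 0.50, 1991: 0.45, 1992: 0.44},
    "Gamma": {1991: 0.10, 1992: 0.08},
}


def yearly_drops(pct_by_name):
    """Return (percent_drop, name, year) for every pair of consecutive years."""
    drops = []
    for name, series in pct_by_name.items():
        for year in sorted(series):
            prev = series.get(year - 1)
            if prev is None or prev <= 0:
                continue  # no value for the previous year, so no ratio
            rr = series[year] / prev  # relative risk versus the year before
            drops.append(((1.0 - rr) * 100.0, name, year))
    return drops


# Largest single-year percent drops first.
top = sorted(yearly_drops(pct), reverse=True)[:3]
for drop, name, year in top:
    print(f"{name} {year}: {drop:.0f}% drop")
```

One simplification to be aware of: a name that falls out of the top 1000 entirely has no percentage for the following year, so this sketch simply skips that transition.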
#6?? I’m sorry, but if I’m going to have one of the most rapidly poisoned names in US history, it best be #1. I didn’t come here to make friends, I came here to win. Furthermore, the names on this list seemed… peculiar to say the least. I decided to plot out the percentage of babies named each of the names to get a better idea of what was going on. (Click through to see the full-sized plot. Note that the y-axis is Percent, so 0.20 means 0.20%.)
These plots looked quite curious to me. While the names had very steep drop-offs, they also had very steep drop-ins as well.
This is where this project got deliriously fun. For each of the names that “dropped in” I did a little research on the name and the year. “Dewey” popped up in 1898 because of the Spanish-American War: people named their daughters after George Dewey. “Deneen” was one name of a duo with a one-hit wonder in 1968. “Katina” and “Catina” were wildly popular because in 1972 a character named Katina was born on the soap opera Where the Heart Is. “Farrah” became popular in 1976 when Charlie’s Angels, starring Farrah Fawcett, debuted (notice that the name becomes popular again in 2009, when Farrah Fawcett died). “Renata” was hard to pin down; perhaps it was popular because of this opera singer who seemed to be on TV a lot in the late 1970s. “Infant” became a popular baby name in the late 1980s for reasons that completely defy my comprehension, and that are utterly un-Google-able. (Edit: someone pointed out on Facebook that it’s possible this is due to a change in coding conventions for unnamed babies. This would make more sense, but would also make me sad. Edit 2: See the comments for an explanation!)
I think we all know why “Iesha” became popular in 1989:
“Khadijah” was a character played by Queen Latifah in Living Single, and “Ashanti” was popular because of Ashanti, of course.
“Hilary”, though, was clearly different from these flash-in-the-pan names. The name was growing in popularity (albeit not monotonically) for years. So to remove all of the fad names from the list, I chose only the names that were in the top 1000 for over 20 years, and updated the graph (note that I changed the range on the y-axis).
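A minimal Python sketch of that fad filter follows. It assumes “over 20 years” means more than 20 yearly appearances in total (not necessarily consecutive), and the names, values, and the helper `is_established` are hypothetical.

```python
MIN_YEARS = 20  # threshold from the text: in the top 1000 for over 20 years


def is_established(series, min_years=MIN_YEARS):
    """True if the name appears in more than `min_years` yearly top-1000 lists."""
    return len(series) > min_years


# Toy data: name -> {year: percent}. Values are invented.
pct = {
    "Steady": {year: 0.10 for year in range(1950, 1990)},  # 40 yearly entries
    "Fad": {year: 0.10 for year in range(1976, 1981)},  # 5 yearly entries
}

# Keep only the names that pass the filter.
established = {name: s for name, s in pct.items() if is_established(s)}
print(sorted(established))
```

Requiring consecutive years instead would be a stricter variant of the same idea.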
I think it’s pretty safe to say that, among the names that were once stable and then had a sudden drop, “Hilary” is clearly the most poisoned. I am not paying too much attention to the names that had sharp drops in the late 1800s because the population was so much smaller then, and thus it was easier to drop percentage points without a large drop in raw numbers. I also did a parallel analysis for boys, and aside from fluctuations in the late 1890s/early 1900s, the only name that comes close to this rate of poisoning is Nakia, which became popular because of a short-lived TV show in the 1970s.
At this point you’re probably wondering where “Hillary” is. As it turns out, “Hillary” took two years to descend from the top, thus diluting the relative risk for any one year (its highest single-year drop was 61% in 1994). If I examine slightly more names (now the top 39 most poisoned) and again filter for fad names, both “Hilary” and “Hillary” are on the plot, and clearly the most poisoned.
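To see how a two-year descent dilutes the single-year measure, here is a small Python sketch with invented percentages (these are not the actual “Hillary” numbers, and the function names are hypothetical).

```python
def single_year_drops(percents):
    """Percent drop between each pair of consecutive yearly percentages."""
    return [(1.0 - b / a) * 100.0 for a, b in zip(percents, percents[1:])]


def cumulative_drop(percents):
    """Percent drop from the first value to the last value."""
    return (1.0 - percents[-1] / percents[0]) * 100.0


# Hypothetical names that end up in the same place by different routes.
two_step = [0.100, 0.050, 0.025]  # falls by half, then by half again
one_step = [0.100, 0.025]  # falls all at once

print(max(single_year_drops(two_step)), cumulative_drop(two_step))
print(max(single_year_drops(one_step)), cumulative_drop(one_step))
```

Both made-up names lose the same share overall, but the one that falls in two steps has a much smaller worst single year, so a ranking by single-year drop places it lower.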
(The crazy line is for “Marian” and the spike is due to the fact that 1954 was a Catholic Marian year — if it weren’t an already popular name, it would have been filtered as a fad. And the “Christin” spike might very well be due to a computer glitch that truncated the name “Christina”! Amazing!!)
So, I can confidently say that, defining “poisoning” as the relative loss of popularity in a single year and controlling for fad names, “Hilary” is absolutely the most poisoned woman’s name in recorded history in the US.
Code for this project is available on GitHub.
(Personal aside: I will get sentimental for a moment, and mention that my mother was at Wellesley at the same time as Hillary Rodham. While she already knew that she wanted to name her future daughter “Hilary” at that point, when she saw Hillary speak at a student event, she thought, “THAT is what I want my daughter to be like.” Which was empirically the polar opposite of what the nation felt in 1992. But my mom was right and way ahead of her time.)
Update: This seems to be an analysis everyone is interested in. For perhaps the first time in internet history, Godwin’s Law is wholly appropriate.
I was nine when H.C. became the first lady and I really didn’t get the humor behind being asked if I was Hillary Clinton when I introduced myself. I was like, “I’m nine. Do I look like the first lady to you?” I just can’t wait ’til people start calling us “Madam President.” Thanks for this hilarious investigation and article.
You’re welcome! And I can’t wait for that too!! 🙂
Congratulations, very well written!
I had a great uncle Adolph who felt it prudent to change names in the 40s; I imagine that proportionally that name fell from grace in a spectacular fashion, but I don’t know how popular it was in raw numbers to start with.
I’ve often joked that British men’s names are slowly all becoming names for American women, as are surnames; might be interesting to crunch the numbers on gender ratio of names through time, too.
That’s fascinating — I wonder if we could find data for people who change their names. Surely Adolph/Adolf take the gold for that one!
Well, Harpo Marx’s original name was “Adolph”; he changed it to Arthur during WWII.
Also “Hilary” was originally a British man’s name :D. I want to look at gender ratios eventually.
Many, many women’s names begin as men’s names. It is culturally acceptable to give a masculine name to a girl, but almost never a feminine name to a boy. It’s very hard to find a name that went from being a women’s name to a men’s name.
Adolf and Adolph did decline in popularity in the US during and after the Second World War. However, they had already been in decline prior to the war starting, as were other German names. Given that German immigration to America was slowing, this isn’t very surprising.
What about the boys? I guess you need another weekend….
I wouldn’t expect them to challenge you for the top spot. Boys’ names have less entropy through the years, see:
so one interpretation is that they are less likely to drop as quickly. Although it is shocking that Hilary dropped faster than Adolf!?!?!
I did briefly look at boys and definitely saw what the babynamewizard saw — the only name that came close to this level of poisoning was Nakia, another fad name.
I’m creating a plot of Adolf/Adolph right now! It is a surprisingly stable name.
I love the name Hillary, and it should be SO popular right now—and yet, you’re completely right, it’s done for. I wonder how long before the association (which I too consider positive) wears off?
Do note that the SSA happily shares the information in condensed form, no need to do any scraping:
Ah yes someone pointed me to this today. Oh well!
I can tell you about “Infant”. In 1990 the U.S. passed legislation that required SSNs on the tax return for anyone claiming dependents. This cracked down on a type of tax fraud where people would claim numerous non-existent dependents (I have, uh, 23 children… hey, no taxes this year!) to reduce their taxes. In conjunction with that, they had a program for hospitals to give new parents a form to request a SSN for their newborn. But the catch is, you had to have a name for the child. While most people gave their children names at birth, some didn’t (various cultural practices delay naming). But anyway, these people were now encouraged to name their kids and apply for an SSN immediately at birth. That is almost certainly the reason for the sudden non-popularity of “infant” as a name.
Oh wow, thank you for pointing this out!! That explains it perfectly.
Fascinating and rigorous analysis! Wow. This is a much better analysis of baby name trends over time than the chapter in Freakonomics. And we can all look at your analysis code on github. Good work! Convincing. Could you run it and show us the graph for the most poisoned male names?
Great writing, Hillary. Intelligent, witty, scientific and humorous all at the same time. I encourage you to write more – perhaps a book is in your future…
Really interesting stuff. I wondered where the heck Catina/Katina came from.
Farrah’s drop off came in 1978-79 with the highly publicized arrival in the US of the deposed Shah of Iran Reza Pahlavi and his wife …Farah.
Fun article and interesting. Regarding a drop off in popularity for boys’ names, try Adolph in the 1940s. Saw this on some site that measured popularity of names – it was a real “name cliff” drop.
Whoops. I didn’t read all the comments before posting. I see Adolph has been discussed.
I’ve always suspected that my name, “Christine”–not spelled with a K or Kr at the beginning or A at the end–was poisoned due to the Stephen King book and movie. Both came out in 1983, and according to your chart, “Christin” spiked at about 1990. Did your computer glitch truncate “Christine” and not “Christina”? If so, then there’s some credence to my suspicion.
Yes it would have truncated “Christine” as well!
What about “Monica” after Bill Clinton’s whole…well, you know.
Monica drops 25% and 35% in ’98 and ’99, respectively. Nowhere near Hilary’s 70%.
If I recall correctly Monica had some huge growth in the ’90s after the debut of Friends, but dropped off pretty quickly (as you describe) after the Lewinsky affair.
I seem to remember that in the UK Jade fell over 70 places in a year when it ceased to be associated with Jade Jagger and became attached to a Big Brother contestant who had been accused of racist bullying. It has never been popular since.
Meta-analysis – the hillbilly Republicans who were naming their kids Hillary stopped when Rush Limbaugh told them she was evil and wanted to take away their right to pay $16 for a Tylenol.
Now I want to have a daughter and name her Hillary, in honor of two Americans – our First Lady/Senator/Secretary of State and the author of this great research article.
Perhaps she should change her name before running for President?…
Reblogged this on Stats in the Wild and commented:
Hilary: the most poisoned baby name in US history
Since I have the name and spell it Hillari – I have to think I have THE most poisoned name – EVER 🙂 Thank you, I thoroughly enjoyed reading.
The fact that Hillary didn’t decline as rapidly as Hilary tells me that either:
1) Some who would have named their child Hilary in 1992-1994 instead named her Hillary, or
2) There were some folks who might have named their daughter Rachel or something but thought H.C. would be a good namesake and picked Hillary instead.
I suspect that accounting for these would change the numerical results but would certainly not change the qualitative fact that Hilary/Hillary is the most poisoned name of all time in America (I am guessing that Adolph probably got a lot less popular in Germany between 1944-1946)
When expecting our 3rd daughter I did similar analysis, but looking to avoid the trendy crowd. See The perfect baby name: one Dad’s quest using SAS. If you pick a less common name, less chance of it becoming poisoned by someone else who shares it.
The rapid-rise/rapid-fall trend in baby names has been described in:
Berger, J., & Le Mens, G. (2009). How adoption speed affects the abandonment of cultural tastes. Proceedings of the National Academy of Sciences, 106(20), 8146–8150.
Cool! So fun to find that empirically and then see that there’s already been research in the area!
There are some interesting patterns when you try to search books for all these names using Google Ngram here: http://goo.gl/2wobi … and even more so when you search for Hilary (dating back to 1538) here http://goo.gl/2mvjW
Huh, that is interesting. There was a Pope Hilary, and I’d imagine it has to do with that. However he was Pope in the 5th century.
It being the 5th century and all, he was actually Pope Hilarius — my favorite papal moniker of all time.
Hey Hilary, fascinating write-up!
A while back, I did some analysis of the UK baby name data, which we have back to 1996: http://www.guardian.co.uk/news/datablog/2012/apr/25/baby-names-data
Just as you find with Hilary/Hillary in the US, in the UK there’s a rapid dip in the name “Cherie” in the years after Tony Blair became prime minister in 1997 – Cherie Blair is his wife, and she was also attacked a lot by the press, for having the temerity to be a successful lawyer.
I built an interactive tool for searching the UK data, you can see the dip in Cherie here:
It’s depressing, frankly.
I didn’t do an analysis of the biggest one-year drops, but I did look at the biggest proportional falls from 1996-2010 overall, using the same method as you. I found that the girls’ names in the UK with the biggest drops in the period were: Brittany, Jordan, Courteney, Lauryn and Kirby. (I can see what’s going on with all those except Kirby.)
Hi, I have a question about one of your steps. When you decided to omit the names from the late 1800s because there was a smaller population, doesn’t that defeat the purpose of using the relative risk ratio? But if what you’re saying is true, that it was easier for a name to drop in popularity because there were fewer people, wouldn’t it be necessary to compare the RR ratio for a name at a given year to the averaged rate of all girl names that declined in that year? E.g. the likelihood of Randi declining from 1982 to 1983 is 16% but the average of all female names that faced a popularity decline from 1982 to 1983 was 33%. Even if Randi does show such a pretty big decline during that year, isn’t that more popular than a name that drops 5% compared to a population mean of .5% for all popularity drops for a given year? I’m not being patronizing btw, I genuinely don’t know if I’m even in the right ballpark with this question. Thanks!
No problem! You totally caught me in my most hand-wavey moment. Yes, me omitting the names for the 1800s because there was a smaller population does sort of defeat the purpose of using the relative risk in theory. But at the same time, with very low population numbers the variance of the relative risk increases. If you look at the graphs you can see this — you can’t really identify where the 67-69% drops are on the “Clementine”, “Minna” and “Celestine” graphs because drops that big weren’t really out of the ordinary (because of the higher variance, i.e. the lines look equally noisy throughout). However with the larger population numbers, the relative risk is more stable year to year because there is decreased variance. So a drop of 70% is much more noticeable in the “Hilary” graph than the “Clementine” one.
In an official analysis we might determine a sample size cutoff or actually estimate the variance in order to determine “fair” comparisons. In this case I just figured hand-waving was best. It’s a great trick in the statistical toolbox, and one I chose to use because even without the handwaving, “Hilary” still wins :D.
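For the curious, here is a minimal Python sketch of what “actually estimate the variance” could look like. It uses the standard large-sample approximation for the standard error of the log relative risk and treats each year’s count as a binomial sample, which is an assumption rather than something I did in the post. The counts and the function name are hypothetical.

```python
import math


def se_log_relative_risk(count_before, total_before, count_after, total_after):
    """Approximate standard error of log(relative risk) between two years.

    Standard large-sample formula: sqrt(1/a - 1/n1 + 1/c - 1/n2), where a and c
    are the counts with the name and n1 and n2 are the total births.
    """
    return math.sqrt(
        1.0 / count_before - 1.0 / total_before
        + 1.0 / count_after - 1.0 / total_after
    )


# Hypothetical: the name holds a 0.1% share in both years, but the
# population is 20 times larger in the second scenario.
small = se_log_relative_risk(100, 100_000, 100, 100_000)
large = se_log_relative_risk(2_000, 2_000_000, 2_000, 2_000_000)
print(round(small, 3), round(large, 3))
```

With the same share of babies, the smaller population gives a noticeably larger standard error, which matches the noisier-looking lines in the early years of the graphs.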
This is totally awesome. BUT I have to admit my first thought was that you were going to say Hilary/Hillary was the most poisoned name…like LITERALLY POISONED. Like the people who had it were poisoned more than anyone else.
Has anyone ever done a study of that? What names are shared by the most victims of poisoning?
This proves nothing, but a William (Bill) Deneen put out an Encyclopaedia Britannica film in 1963 called Japan: Miracle in Asia which he photographed out a window of a single-engine plane he was piloting. This could have caught the popular imagination enough to make his name popular for a short time. http://archive.org/details/japan_miracle_in_asia_1963
Not a bad hypothesis! Though I’d be surprised if this spurred a popular girls’ name, rather than a boys’ name.
When asked which of his novels was his favorite, Nabokov wonderfully replied:
“I would say that of all my books Lolita has left me with the most pleasurable afterglow—perhaps because it is the purest of all, the most abstract and carefully contrived. I am probably responsible for the odd fact that people don’t seem to name their daughters Lolita any more. I have heard of young female poodles being given that name since 1956, but of no human beings”
I love this quote, I love Nabokov, and I love “Lolita” (one of my favorites). However, sadly for Nabokov it isn’t quite true. In fact, there was a spike of girls named Lolita in the early ’60s. https://hilaryparker.com/2013/01/30/hilary-the-most-poisoned-baby-name-in-us-history/names_lolita/
The spike coincides with the release of the film version of Lolita with James Mason and Shelley Winters. Books don’t cut it, when it comes to mass cultural reactions.
Makes total sense–thanks for pointing that out! It’s exactly counter to what Nabokov was claiming, of course…
So I was born in 1972 and named Trinity. I had never heard my name said as a name (not directed to me) until I saw The Matrix in the theater. Until then I had only heard my name in relation to the spaghetti westerns (Trinity Is My Name) and churches. Oh and I’m a programmer so yeah, I’d like my name back …
Reblogged this on Datapolitan and commented:
Great post on using statistics to defend your good name.
This is really awesome to me, as a data geek myself I really appreciate the clear graphics and awesome research that went into this. And managing to make it an easy read on top of that is spectacular. Also, finding really weird new names I’ve never heard before.
The only thing I can suggest is to put the definition of ‘poisoning’ first because I read through the whole article like ‘wha? – what does a poisoned name mean?’
Also reblogging on: The Rant Of the Databasefairy (www.noelhollis.com)
Very late to the party here, but I would love to see the converse of this — the most rapid rises from obscurity. I’ve got my money on “Madison”…
Ah! Actually Nathan Yau at Flowing Data looked at just that! There was some overlap. http://flowingdata.com/2013/07/29/the-most-trendy-names-in-us-history/
I love this! I’m another Hilary! Also, I was named after Hilary Brown, who is a television reporter. My parents couldn’t agree on a “girl” name they both liked. One evening, they were watching the news, and Hilary Brown was reporting. They both liked her name and thus… I was bequeathed.