CamelCase vs underscores: Revisited

It has been two years since I published “CamelCase vs underscores: Scientific showdown”, and it is still easily the most visited article on this blog. Yesterday alone it got 2,614 views thanks to a forum post on Y Combinator, dwarfing my normal visit rates. What makes it such a hot topic? Honestly, it doesn’t interest me that much anymore, since there are many more important ways to make your code more readable. Note that it is code comprehension we are talking about here, not how fast you can write code! Previously, I outlined how the entire discussion could be made obsolete by moving away from a textual representation of code, and in my previous post I related software design principles, viewed as an act of communication, to the cooperative principle in linguistics. Nonetheless, given the immense interest this article seems to be getting, I feel it’s my obligation to report on follow-up research to the previously discussed paper “To camelcase or under_score” by Binkley et al. (2009) (PDF).

“An Eye Tracking Study on camelCase and under_score Identifier Styles” by Sharif and Maletic (2010) (PDF) replicates the previous study but deviates from it on a few points:

Only programmers are used as subjects.
All of the subjects had experience with both styles, and their style preference was split approximately evenly among the groups.
Most of the subjects were historically trained in the underscore style. (The opposite was true in the study by Binkley et al.)
Eye tracking is used to measure fixation count and rate. Results from previous eye tracking studies in the domain of cognitive psychology imply that camel-cased identifiers should be more difficult to read compared to underscored identifiers.

No difference in accuracy was reported (as opposed to Binkley et al.), but on average camel-cased identifiers took 932 ms (20%) longer than underscored identifiers, in line with the 13.5% increase reported by Binkley et al. The eye tracking results also give some insight into visual effort: camel-cased identifiers require a higher average fixation duration.
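To put the reported numbers in perspective, here is a back-of-the-envelope calculation in Python. It assumes the 20% is measured relative to the average time for underscored identifiers; the implied averages below are derived from the two figures quoted above, not taken from the paper, so treat them as rough.

```python
# Rough sanity check of the reported slowdown.
# Assumption: the 20% is relative to the underscore average.
extra_ms = 932            # reported extra time for camel-cased identifiers
relative_increase = 0.20  # reported relative increase

underscore_ms = extra_ms / relative_increase  # implied underscore average
camel_ms = underscore_ms + extra_ms           # implied camelCase average

print(f"implied underscore average: {underscore_ms:.0f} ms")
print(f"implied camelCase average:  {camel_ms:.0f} ms")
```

Under that assumption the task took roughly 4.7 seconds with underscores and 5.6 seconds with camelCase, so the difference is per recognition task, not per identifier glanced at while reading code.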

If you are interested in the details of the studies, don’t forget to read the papers yourself. I linked to them for your convenience, but if the links break you can easily find them by looking up their titles on Google Scholar.

It seems the subject has received more research attention over the past two years. You can find relevant resources yourself by checking the ‘Citing Documents’ of the discussed papers, but here are a few interesting ones:

Women and men – Different but equal: On the impact of identifier style on source code reading by Sharafi et al. (2012) (PDF)
Context and Vision: Studying Two Factors Impacting Program Comprehension by Soh, Z. (2011)
Can Better Identifier Splitting Techniques Help Feature Location? by Guerrouj et al. (2011) (PDF)
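
Identifier splitting, the topic of the last paper, means breaking an identifier into its constituent words. The simplest possible approach relies purely on the two conventions discussed in this post. The Python sketch below is such a naive baseline, not one of the techniques evaluated by Guerrouj et al.; the function names are hypothetical and chosen for illustration only.

```python
import re


def split_identifier(name: str) -> list[str]:
    """Split a camelCase or under_score identifier into lowercase words."""
    words = []
    for part in name.split("_"):
        # Per part, match in order: an acronym (uppercase run not followed
        # by a lowercase letter), a capitalized or lowercase word, or digits.
        words += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return [word.lower() for word in words]


def to_underscore(name: str) -> str:
    """Rewrite an identifier in the underscore style."""
    return "_".join(split_identifier(name))


def to_camel_case(name: str) -> str:
    """Rewrite an identifier in the camelCase style."""
    # Note: raises ValueError when the identifier contains no words.
    first, *rest = split_identifier(name)
    return first + "".join(word.capitalize() for word in rest)


print(split_identifier("parseHTTPResponse"))  # ['parse', 'http', 'response']
print(to_underscore("btnSubmit"))             # btn_submit
print(to_camel_case("to_camel_case"))         # toCamelCase
```

This sketch discards leading and trailing underscores and lowercases acronyms, so it is lossy. Identifiers that follow neither convention (for example, concatenated lowercase words) are left unsplit, which is exactly the case where more advanced techniques are needed.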

Author: Steven Jeuris

I have a PhD in Human-Computer Interaction and am currently working both as a software engineer at iMotions and as a postdoc at the Technical University of Denmark (DTU). This blend of research and development is the type of work which motivates and excites me the most. Currently, I am working on a distributed platform which enables researchers to conduct biometric research 'in the wild' (outside of the lab environment). I have almost 10 years of professional software development experience. Prior to academia, I worked for several years as a professional full-stack software developer at a game development company in Belgium: AIM Productions. I liked the work and colleagues at the company too much to give up entirely for further studies, so I decided to combine the two. In 2009 I started studying for my master in Game and Media Technology at the University of Utrecht in the Netherlands, from which I graduated in 2012.

10 thoughts on “CamelCase vs underscores: Revisited”

Pingback: CamelCase vs underscores: Scientific showdown « Whathecode

Your arguments are interesting, but Chris Done has [more concise and appealing arguments](http://chrisdone.com/posts/camelcase-vs-underscores-vs-hyphens), in my opinion. 🙂

Steven Jeuris says:

March 4, 2015 at 1:31 am

I’d agree (and this is also in line with both studies), but unfortunately not an option in many languages.


Hello. Thank you for articles on case. I am not a coder. I am a project lead for records management. We have several options to name our records:
1. Underscore
2. Hyphens
3. Spaces
4. Camel case.

I want the e-file names to readily transfer from one platform to another. I am concerned that spaces will not transfer consistently and correctly. That is, we lose part of the name.

Underscores get very low uptake. People just forget or don’t bother. In the world of two-finger typists, underscores are hard.

We want to be able to scan very long lists of documents and see how they are related (naming conventions) and be able to retrieve quickly.

Your articles provided some things to consider, but I rejected CamelCase due to its 20% increase in the time needed to identify a name. We literally have millions of items to name and potentially retrieve. 20% is considerable under those circumstances.

Thank you again for your posts. Enjoyed them immensely.

I go back and forth between JavaScript and PHP. PHP uses snake case for built-in functions while JavaScript uses camelcase. What I would like to be able to do is use snake case for my custom work in JavaScript and camelcase in PHP. I can’t do it, but it would be nice to easily see what is mine and what is theirs.

When I used Microsoft VB6 I enjoyed their built-in camelcase naming because the prefix identified what type of variable I was looking at. I found that helpful. Example: btnSubmit. For whatever reason I found that very readable. Perhaps because of the repetition of the prefix.

That second study you mention states “In this study, subjects were trained primarily in the underscore style (…)” without stating explicitly if or what has been done to account for this bias. Without a control group trained in no style or in camel case style it’s very hard to tell how this influenced the results. Which is why I’d still give more credit to the first one.

Interesting study. I am particularly interested in names of TEST methods only. I am not about to try and fight the idiomatic style in any language for production code. However, I do argue, repeatedly, that for test functions the context differs.

Test functions are:
1. rarely called explicitly
2. typically longer than production code methods
3. designed to be read

I do think this is a compelling use case for underscores, where idiomatic or dogmatic arguments might override readability in _normal code_.

example (C#)
[Test]
public void a_user_must_have_a_valid_email_address()

The study has proven that CamelCase is slower and harder to read, fight is supposed to be over, but as usual humanity will take decades to understand and apply this piece of info.

Anon says:

June 13, 2020 at 7:29 pm

Lol, decades? More like centuries.


Pingback: 2 – Snake_case_is_the_best_case - Wazooy Startups
