Handling text is an important part of every programmer's life, and there are many subtleties to it. Lately I have come to realize that all the text I handle in my everyday work falls into three categories:
- Free text: this includes names of people and places, contents of chat messages, books, etc.
- Code: this includes C++, JSON, HTML, etc – anything that is both human and machine readable and writable.
- Identifiers: these are unique names, e.g. instance keys in JSON, file names, database keys and resource names.
I think we as programmers can make our lives much simpler if we decide to stick to these three categories and all agree on their character sets and encodings.
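To make the distinction concrete, here is a minimal Python sketch of how two of the categories might be treated differently in code. The rules in it are my own assumptions for illustration, not an agreed standard: identifiers are restricted to lowercase ASCII letters, digits and underscores, while free text allows any Unicode and is normalized to NFC. The names `is_identifier` and `normalize_free_text` are hypothetical.

```python
import re
import unicodedata

# Assumption: identifiers use a small ASCII-only character set
# (lowercase letter first, then lowercase letters, digits, underscores).
_IDENTIFIER_RE = re.compile(r"[a-z][a-z0-9_]*")


def is_identifier(text: str) -> bool:
    """Return True if the whole string fits the assumed identifier rule."""
    return _IDENTIFIER_RE.fullmatch(text) is not None


def normalize_free_text(text: str) -> str:
    """Free text may contain any Unicode; assume it is kept in NFC form."""
    return unicodedata.normalize("NFC", text)
```

With these rules, a name like `Zoë` is valid free text but would be rejected as an identifier, which is the kind of separation the three categories are meant to encourage.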