Fun with Char
One of the things I found really interesting early on in the Stanford programming methodology course was chars in Java. I found that I was taking a lot for granted about characters when it comes to computers. I know what an 'A' and a 'B' and a 'C', and so on, are, and so I thought that computers understood them in the same way. I think of an 'A' and an 'a' as two versions of the same character, upper and lower case. Therefore I sometimes found myself wondering why there was this case sensitivity thing you had to deal with on computers. Doesn't the computer know an 'A' is the same character as an 'a'? So I found it rather eye-opening, and oddly fascinating, to learn that an 'A' doesn't mean anything to the computer; it's actually just a number. What's more, an 'a' is also just a number, and it's a different number. So of course case sensitivity is an issue for a computer, because there's nothing that really ties the two characters together for the computer. They're two entirely different characters, really two different numbers.
The ASCII Table
I had seen and heard about the ASCII table before but was only using it as a reference for the sort order of the numbers and special characters in relation to the letters. Now I was coming to have a greater appreciation for what had been established and standardized so that we could effectively communicate with one another through different computers.
I learned that an 'A' was decimal number 65 and an 'a' was number 97. I learned that a '0' was actually the number 48.
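A quick way to see those numbers for yourself is to let Java widen a char to an int. This is only a small sketch; the class name CharCodes and the helper codeOf are names I made up for illustration, not anything from the course.

```java
public class CharCodes {
    // Hypothetical helper: returns the numeric code of a character.
    // A char widens to int implicitly, so no cast is needed here.
    static int codeOf(char ch) {
        return ch;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A')); // 65
        System.out.println(codeOf('a')); // 97
        System.out.println(codeOf('0')); // 48
    }
}
```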
So then I learned how to write my own method to change an upper case character to its lower case equivalent. I learned that you could do math on letters! Yes, I am fully aware that the exclamation point at the end of that sentence means that I'm a geek. What's surprising to me isn't that I found this realization fascinating, but that I hadn't discovered this sooner in life so I could geek out on it.
I learned that to change an uppercase 'A' to a lowercase 'a' you actually add 32 (65 + 32 = 97), and the same is true for every other letter. But what's better, I learned that, even though ASCII is unlikely to be changed or abandoned, you wouldn't want to put 32 in your code. Instead you want to use the characters themselves and let the computer figure out the numeric difference between them, just in case the table ever did change.
What you know is that 'A' through 'Z' will always be sequential numbers in whatever code is used and 'a' through 'z' will always be sequential. In ASCII there are six extra characters ( [ \ ] ^ _ ` ) between 'Z' and 'a' (26 letters + 6 characters = 32). But should some other table ever be used, the difference may no longer be 32. So what you write to convert an upper case character to its lower case version is something like this...
public static char toLowerCase(char ch) {
    if (ch >= 'A' && ch <= 'Z') { /* to see if input character is upper case */
        /* dynamically determines and adds the difference between the two; */
        /* the arithmetic yields an int, so it must be cast back to char */
        return (char) (ch + 'a' - 'A');
    } else {
        return ch; /* if not upper case, return as is */
    }
}
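To try the method out, here is a small self-contained sketch. It repeats the same idea as the method above, with an explicit cast back to char. The class name LowerCaseDemo and the helper lowerAll are my own inventions for illustration; in real code you would more likely reach for the built-in Character.toLowerCase.

```java
public class LowerCaseDemo {
    // Same idea as the method above: shift upper case letters by 'a' - 'A'.
    public static char toLowerCase(char ch) {
        if (ch >= 'A' && ch <= 'Z') {
            return (char) (ch + 'a' - 'A');
        }
        return ch; // anything else is returned unchanged
    }

    // Hypothetical helper: lower-cases every character of a string.
    public static String lowerAll(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(toLowerCase(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLowerCase('G'));          // g
        System.out.println(lowerAll("Hello, World!")); // hello, world!
    }
}
```

Note that digits and punctuation pass through untouched, because they fall outside the 'A' to 'Z' range.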
Since they are really just numbers, you can also use char in control statements to loop through the equivalent letters, such as...
for (char ch = 'A'; ch <= 'Z'; ch++)
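Filled out into something you can run, that loop might look like the sketch below. The class name AlphabetLoop and the method upperAlphabet are just placeholders I picked for the example.

```java
public class AlphabetLoop {
    // Builds the upper case alphabet by counting from 'A' up to 'Z'.
    static String upperAlphabet() {
        StringBuilder sb = new StringBuilder();
        for (char ch = 'A'; ch <= 'Z'; ch++) {
            sb.append(ch); // ch++ moves on to the next character code
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperAlphabet()); // ABCDEFGHIJKLMNOPQRSTUVWXYZ
    }
}
```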
This is actually a simple concept, but it was an eye-opening and fun realization.