Chapter V: Computation | The Philosophy Of Science by Steven Gussman [1st Edition]
"Some authors such as Konrad Lorenz, John Barrow, Jürgen Schmidhuber and Stephen Wolfram
have gone as far as suggesting that the laws of nature are both computable and finite like a
cellular automaton or computer simulation."
– Max TegmarkI
Computer science (which may as well be called computer mathematics) is necessary for a full understanding of both math and science, and it further ties in engineering. This is because, even for students who do well in math classes, it can sometimes be unclear why they are performing the rules and calculations they are performing. By placing the coder in the role of god, computer programming allows one to see how they might reverse engineer the world, which in turn makes them better at discovery on the other side of things in the cosmos.
To begin with, one must understand the basics of computer hardware: that is, the physical machine that is the computer. The details have become more complex in many architectures, but at base, computers are von Neumann machines:II they consist of a central processing unit (CPU), memory (random-access memory, or RAM), storage (a hard disk drive, or HDD, or a solid-state drive, or SSD), some sort of output medium or display (such as a monitor-screen or a printer), and some form of user-input device (such as a keyboard, mouse, video game controller, or webcam). The computer is most closely identified with the CPU; the processor is, as its name suggests, the component which does all of the calculating! All arithmetic or logical operations are performed here, in the form of binary, in this now-small physical component. Some machines will have slower or faster processors, different instruction set architectures (ISAs: the version of binary language that a particular processor uses, defining the elementary operations it may perform to implement larger operations), multiple cores (meaning one chip with multiple processing units working in parallel), and even dedicated graphics processing units (GPUs) designed for efficiently drawing 2D and 3D imagery to screens. The CPU actually has a bit of very fast memory on-board, the registers, which is where data is temporarily held while the unit processes it. But most of what the currently running programs (or processes) need, such as the compiled machine code itself and any open files (such as text, images, audio, and video), is temporarily stored in RAM (this memory is considered volatile because its contents are lost whenever the computer loses power; a process's share of it is also released when that process is closed). While not quite as fast as a register, RAM is quite fast (which is why it is of medium size and used as temporary memory while a program is running).
When a program is not running, or a resource is not being used, it is merely waiting in storage: the slowest, cheapest, largest pool of memory in the machine. Users are familiar with storage hardware if they have ever used a hard drive, USB / flash drive, or SD card, and they are familiar with the concomitant software anytime they explore their file-system (to view their files with file extensions such as .txt or .jpg). If one actually opens, say, an image file to view it, the data associated with the image is copied into the fast RAM and an algorithm is run on the processor to interpret the data structure so that it may draw the associated image to the screen (which will likely include progressively moving portions of the image into and out of the registers as it's drawn). On modern machines, this process will all appear to have happened instantaneously, but recognize that even seemingly small processes likely consist of your CPU performing many tens-of-thousands-or-more calculations in well under a second. Finally, while some computers may be largely set up to run processes without constant use of local input or monitors (for example, the servers keeping the internet up and running, for which the rest of the machines in the world, the clients, are technically the constant input and display hardware), most people are familiar with an intimate relationship between computers and screens, keyboards, and mice. Early computers printed their output on paper, but today, everything from simple text to advanced 3D video games is drawn on high resolution screens so that the programmer and user alike can monitor what the programs are actually doing, in real-time, with less waste. To change what a computer is doing, programmers and users alike will use mice to click around a graphical user interface (GUI), a keyboard to type commands or to write a book, or a video game controller to make Mario jump!
Often, computer science is taught in pseudo-code—that is, rather than in any particular programming language, examples are given in a semi-English / semi-computational language. For the same reason one can do this, I will not: most modern programming languages are quite similar, and it will not tend to be terribly difficult to translate between them (certainly the differences between actual programming languages in use today will tend to be smaller than those between the different natural human languages spoken!). Therefore, I think it would be more aesthetically pleasing and educationally functional to use the syntax for the versatile modern programming language known as C#.
The point of this chapter is more to get the reader to think like a computer scientist than to learn to write C# programs, but one should have a jump-start on both by the end of it. As in the “Logic” and “Mathematics” chapters, some elementaries are in order. A computer programming language is a formal language that is ultimately translated into instructions run by a computer's central processing unit (CPU) so as to make it perform tasks; personal computers (such as your desktop, laptop, or smartphone) are general-purpose computers, and the beauty is that you can get them to do just about anything if you're clever enough. A programming language has a general syntax (the actual rules of how the code is written) and functionality built into a given development environment with which you write the programs for your computer to execute (such integrated development environments, or IDEs, are to a computer program what a word processor, such as OpenOffice Writer, is to a book).
Much of what a computer programmer does (and ultimately all of it, at the lowest level) is mathematics, because your computer is essentially a complex calculator (in fact, so too is the cosmos, it seems). One has probably heard before that with computers, “it's all ones and zeroes”, but what does this mean? It refers to the fact that our computers are (for hardware engineering reasons: it's simply easier to tell whether a physical component holds any charge at all than it is to measure, with certainty, the particular amount of charge it holds)III based on logic in the form of binary mathematics. How then, do we go from such transistors to numbers to words, photos, videos, and sounds? This is through symbolic interpretation. One sets up a way of storing any of these qualia, called a data structure (say for text or images), in binary, and then the computer “knows” (by use of an algorithm in the software) how to translate this information into colors on the screen or sounds out of your speaker hardware (most readers will be most familiar with this concept by file types / file extensions: if your computer treats a bag of numbers as a .txt file, it will interpret that binary storage as text; if it treats it as a .jpg, it will interpret it as an image). The details of such data structures and algorithms are part of what a computer science major studies at university, and what low-level engineers must concern themselves with to write the programs that read and display, say, video files (your Netflix app, for example).
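To make the idea of symbolic interpretation concrete, here is a minimal sketch in C#. The class name InterpretationDemo, its two helper functions, and the three sample byte values are all assumptions chosen for illustration (they do not come from any real file format); the point is only that the very same stored numbers can be read out either as numbers or as text, depending on which algorithm is applied to them.

```csharp
using System;
using System.Text;

public static class InterpretationDemo
{
    // Interprets the stored bytes as text, using the ASCII encoding.
    public static string AsText(byte[] data)
    {
        return Encoding.ASCII.GetString(data);
    }

    // Interprets the very same bytes as a list of base-10 numbers.
    public static string AsNumbers(byte[] data)
    {
        string[] parts = Array.ConvertAll(data, b => b.ToString());
        return string.Join(", ", parts);
    }

    public static void Main()
    {
        // Hypothetical "bag of numbers", chosen only for illustration.
        byte[] data = { 72, 105, 33 };

        Console.WriteLine(AsNumbers(data)); // prints: 72, 105, 33
        Console.WriteLine(AsText(data));    // prints: Hi!
    }
}
```

Nothing about the bytes themselves changes between the two print-outs; only the interpreting algorithm does, which is roughly what a file extension tells the computer to choose.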
〰〰
Arithmetic tends to be quite straight-forward in programming languages, and many of the normal symbols are used (+, -, *, /).IV One mathematical function you likely secretly learned in grade-school makes an important return in computer science because it is surprisingly useful in figuring out if a number is odd or even, or otherwise cycling values back around beyond a certain limit. When you first learned division, before you learned long division, you would actually perform integer division and provide your grade-school teacher with both the whole-number quotient and the remainder. This remainder is what is returned in the operation called modulo (or modulus), and in computer science, the operator is typically (and strangely) the percent sign, thus:
10 % 2 = 0
means “ten mod two equals zero,” because 2 goes into 10 a whole number of times (5) with no remainder (0), whereas:
10 % 3 = 1
because 3 goes into 10 three times for a total of 9, and therefore a remainder of 10 - 9 = 1. A common way to check if a number is even or odd is to mod by two: if a number mod two is zero, it's even, but if it has a remainder, it's odd (because even numbers are evenly divisible in half, and odd numbers are not). Take the following example:
20 % 2 = 0
Because 2 goes into 20 a whole number of times (10) with no remainder (0), 20 is even. Now, an opposing example:
5 % 2 = 1
Because 2 goes into 5 twice for a total of 4, with a remainder of 1, 5 is odd.
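The two uses of modulo mentioned above (testing for evenness, and cycling values back around past a limit) can be sketched as a small runnable C# program. The names ModuloDemo, IsEven, and NextIndex are hypothetical, invented for this illustration.

```csharp
using System;

public static class ModuloDemo
{
    // A number is even when dividing it by two leaves no remainder.
    public static bool IsEven(int number)
    {
        return number % 2 == 0;
    }

    // Steps to the next position in a list of the given length,
    // wrapping back around to 0 after the last position.
    public static int NextIndex(int index, int length)
    {
        return (index + 1) % length;
    }

    public static void Main()
    {
        Console.WriteLine(10 % 3);          // prints: 1
        Console.WriteLine(IsEven(20));      // prints: True
        Console.WriteLine(IsEven(5));       // prints: False
        Console.WriteLine(NextIndex(2, 4)); // prints: 3
        Console.WriteLine(NextIndex(3, 4)); // prints: 0 (wrapped around)
    }
}
```

The wrap-around in NextIndex is the same trick a clock face uses: one step past the last position lands back on the first.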
By convention, in logic, variables tend to take capital early alphabet letters like A, B, and C. In mathematics, these tend to be lowercase letters from around the alphabet with x, y, and z often on hand. In computer programming, the programmer names their variables whatever they want, and because they often serve real world purposes (such as calculating the tax on a transaction), they tend to have full-word names (following certain language-specific syntax rules). For starters, variables in programming languages tend to be typed, meaning that a particular variable cannot hold (or be interpreted as) just any kind of data (even though it is all binary at base, the data structure and algorithms used to evaluate the variables matter). Some basic types for a variable are a bool (for holding binary true-or-false values, or single-bit binary numbers);V an int (for holding integers); a float (for holding real numbers); and a string for holding text, which is written in code as a sequence of characters surrounded by double quotes, known as a string-literal. An int cannot hold the value "cat" any more than a string may hold the value 19, because the computer sets aside a particular amount of RAM for a given variable based on its type, and uses a particular algorithm to interpret its data structure; this is how the computer “knows” to convert that binary number into a base-10 integer or a string of text, depending on the context. Take the following lines of code:VI
float price;
price = 9.99f;
float taxedPrice = price * 1.07f;
The line “float price;” is a variable declaration: it tells the computer to set aside some space in memory (RAM) to hold a number of type float (a floating-point number is a fancy way of saying a decimal or real, fractional number). In many languages, the semicolon is used to denote the end of a statement (technically, one could write all of their code on one line as long as each statement was separated by a semicolon, as most languages' compilers or interpreters, the programs which allow the processor to actually run your high-level code as binary machine language, do not care about whitespace characters, with the very notable exception of Python). The line “price = 9.99f;” is the assignment of the value 9.99 to the variable price; now when price is referred to in code, the computer knows to sub-in 9.99. The line “float taxedPrice = price * 1.07f;” is doing a few things at once: it is an initialization because it declares a variable and sets its value at once (under the hood, these of course occur sequentially in the processor in both cases), and it is setting the variable not to a literal value but to an expression which evaluates to 9.99 × 1.07, or 10.6893.VII Similarly, programmers may also create constants (whose values cannot be changed after initialization), and whose names are by convention fully capitalized:
const float SPEED_OF_LIGHT = 3E8f;
I believe all of this demystifies variables from other fields, like logic's A and math's x, because school-children are often required to take the names and values of such variables as given for a particular homework or test problem without being asked what they are, where they come from, or why logicians and mathematicians find them useful. Without the need for the inherently difficult and rigorous work of a logician or mathematician attempting to establish the next geometric proof, computer programming puts the student in the seat of the creator of the variable, of its name, and of setting its values for a practical purpose, thus shedding light on the underlying formal systems at play.
This is only more true of functions.VIII K-12 students only encounter functions from time to time, and I know from experience that it can feel like a fairly contained exercise: a puzzle to be solved but whose use is not self-evident. Given that f(x) = x², a student may be able to tell you that f(2) = 2² = 4, but what does “f” mean? Why the parenthetical notation? What is x and why can it take almost any value? It turns out that “f” literally stands for “function” because you were being taught about functions purely in the abstract (with g(x) and h(x) simply being the next couple of letters in the alphabet, for when multiple functions needed to be defined at the same time). But the concept of functions is quite useful, and when they have meaningful names and contents, their power becomes obvious: once again, computer programming takes you out of the seat of mere calculator and puts you into the seat of the author. Consider the following function:
float FinalPrice(float price, float taxRate){
    float product = price * (1f + taxRate);
    return product;
}
This is not actually the simplest way to write this function, but it is instructive at this early stage. The first line you see is called a function signature: this establishes what kind of value the function will return (in this case, a float), what its name is (in this case, “FinalPrice”),IX and names any inputs, otherwise known as parameters or formal arguments (in this case, one passes in two floats, a “price” and a “tax rate”). The opening and closing braces surround the body of the function (itself indented by convention), which is where the calculation actually goes on; in this case it consists of the calculation of the final price and a return statement which submits the answer back to the caller. If one actually wanted to use this function in code, later, they would call it thus:
FinalPrice(9.99f, 0.07f);
a call that would return, again, 10.6893.
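For readers who want to see the definition and the call working together, here is a minimal sketch of a complete C# program. The wrapper class PriceDemo and the rounding to two decimal places are assumptions added for illustration; the FinalPrice calculation itself is the one described above.

```csharp
using System;
using System.Globalization;

public static class PriceDemo
{
    // Same calculation as FinalPrice in the text: the price plus its tax.
    public static float FinalPrice(float price, float taxRate)
    {
        return price * (1f + taxRate);
    }

    public static void Main()
    {
        // The call: 9.99 and 0.07 are passed in as the two arguments.
        float total = FinalPrice(9.99f, 0.07f);

        // Floats are approximate, so round to cents for display.
        Console.WriteLine(total.ToString("F2", CultureInfo.InvariantCulture)); // prints: 10.69
    }
}
```

The value handed back by the call is stored in a variable here so that it can be used afterward, which is what one would usually do with a returned result.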
To make this clear, let's recall the grade-school mathematics example. The function's name was f, its parameter was a real number x, and its body was x². All of that was set up by the educators and test-makers, relegating the student to simply calling the function, or performing the calculation. Here is how it would look if you were to define that function in a computer programming language:
float f(float x){
return x * x;
}
which would be called thus:
f(2f);
a call that would return 4.
As a brief aside, most programming languages allow you to place comments around so that you and others know what the code in a particular region is doing (the computer's compiler or interpreter, and ultimately the CPU itself, then ignores this text as though it were whitespace: it is only for helping human programmers navigate a code file). In many languages, a single-line comment is denoted by two forward slashes, “//”, while a multi-line comment is contained within an opening forward-slash-asterisk and a closing asterisk-forward-slash, “/* */”:
/* The following code is
 * a basic example of a function
 * which squares a number. */
float Square(float squareRoot){
    return squareRoot * squareRoot;
}
Notice that, as programmer, I changed the silly grade-school placeholder name “f” to a more useful name for our function: “Square”; conversely, I changed the name of the parameter from “x” to “squareRoot”, because technically when you square a number, the original number is the square root of the new number. Even without a comment, at a glance, another programmer will much more quickly understand what my function is and what it does (you can imagine this becomes more important as functions become far more complex).
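Before our next topic, we will return to variables for a moment. A special type of variable which holds a whole collection of values is the array: one can make an array of variables of any type they choose. Let's say one wants to have a list of names, say of the students in their class. They might go about this like so:
string[] studentNames = {"Zack", "Maggie", "Katie", "Steve"};
Console.WriteLine(studentNames[2]);
The first line stores four names (Zack, Maggie, Katie, and Steve) in the array; the second prints the element at index 2. Because C# counts array indices from zero, that is the third name, Katie.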
This is very similar to our collection of rectangle widths in the “Mathematics” chapter: there, we used a subscript to pull out an element of the list from a particular index, but here in computer science, we use the square-bracket notation instead. In math, studentNames[1] might look, by mere convention, more like s₁. Modern programming languages have several types which achieve such collections of information, but the array is the oldest and simplest.
Now, collections are not all that useful a data structure without appropriate algorithms for unlocking their power. Imagine one teaches a giant lecture hall of two-hundred students: it's great not to have to make and keep track of two-hundred separate string variables, one for each student, but getting, setting, and all-around processing an array of size two-hundred still sounds tedious! Well, there is a way to execute similar code (such as retrieving and printing separate strings) over and over again while only needing to write the relevant command once: loops are the simplest way to achieve such iteration. Imagine you already have an array of two-hundred students' names. Instead of writing two-hundred print-statements to print them all out, one could write a loop:
for(int i = 0; i < studentNames.Length; i++){
    Console.WriteLine(studentNames[i]);
}
The first line is the signature of the loop: the “for” denotes that this is a for-loop, a loop that runs for a set number of iterations. Inside of the parentheses are the parameters for the for-loop. First, the loop-control-variable is (in this case declared and) set to a value (in this case, 0). Next, the loop is given a condition for continuing: in this case, the loop will continue to iterate for as long as the loop-control-variable, i, is less than the length of the studentNames array (so the final iteration runs when i is one less than that length, which is the index of the last element). Finally, there is the incrementer: at the end of each iteration, the loop-control-variable will increase by one, thereby pointing to (indexing) the next element in the array. The code inside of the braces is, again, the body of the loop: the code which runs once per iteration, using the value of i at that time; this code would print out two-hundred lines, each the name of a student in the class. There are other kinds of loops, but a for-loop may handle anything they could, with more-or-less clarity.
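As a rough illustration of that last claim, here is a sketch comparing the for-loop with one of the other kinds of loop, the while-loop. The class and function names (LoopDemo, JoinWithFor, JoinWithWhile) are hypothetical, and the four-name array is a small stand-in for the two-hundred-name one; both functions visit every element in order and build up the same text.

```csharp
using System;

public static class LoopDemo
{
    // Visits every name using a for-loop: the control variable, the
    // condition, and the incrementer all live in the loop's signature.
    public static string JoinWithFor(string[] names)
    {
        string result = "";
        for (int i = 0; i < names.Length; i++)
        {
            result += names[i] + "\n";
        }
        return result;
    }

    // The same iteration as a while-loop: the same three pieces are
    // present, but spread around the loop rather than gathered together.
    public static string JoinWithWhile(string[] names)
    {
        string result = "";
        int i = 0;                 // control variable
        while (i < names.Length)   // condition for continuing
        {
            result += names[i] + "\n";
            i++;                   // incrementer
        }
        return result;
    }

    public static void Main()
    {
        string[] studentNames = { "Zack", "Maggie", "Katie", "Steve" };
        Console.Write(JoinWithFor(studentNames));
        Console.Write(JoinWithWhile(studentNames));
    }
}
```

Running this prints the four names twice, once per function, which suggests the two loop styles are interchangeable here; which one reads more clearly is a judgment call.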
There are special kinds of functions in math and computer science known as recursive functions because they are defined in such a way as to call themselves. No function has to be defined in recursive terms (the classic, non-recursive form of a function is known as an iterative function), but some functions may be more easily understood in this way or simply require fewer lines of code to implement (but beware, due to RAM and ISA constraints, the trade-off is that recursive definitions are often less efficient for a computer to run). Take for example a function for raising a number to a power (for simplicity's sake, we will only allow non-negative integer exponents). First, the iterative version (which usually uses loops to achieve iteration, in place of recursion):
float PositivePower(float baseNumber, int exponent){
    // Return special error value if a negative exponent is used
    if(exponent < 0){
        return -1f;
    }
    /* Start the running total at 1, which is also the
     * answer in the case of an exponent of 0. */
    float runningTotal = 1f;
    // Multiply the running total by the base, exponent times
    for(int i = 0; i < exponent; i++){
        runningTotal = runningTotal * baseNumber;
    }
    return runningTotal;
}
This pretty simply cycles through a loop for as many iterations as the exponent calls for, each time multiplying the running-total by the base: if you call PositivePower(5f, 3), it will essentially calculate 1 * 5 * 5 * 5 = 125 (which it will then return to the caller). Now let's take a look at a recursive version which achieves iteration without a loop:X
float PositivePower(float baseNumber, int exponent){
    // Return special error value if a negative exponent is used
    if(exponent < 0){
        return -1f;
    }
    // Return 1 if the exponent is 0
    if(exponent == 0){
        return 1f;
    }
    // Base case: if the exponent has been reduced to 1, return the base
    if(exponent == 1){
        return baseNumber;
    }else{
        return baseNumber * PositivePower(baseNumber, exponent - 1);
    }
}
In this case, it might be a little more complicated to write this function in terms of recursion (I confess, recursion has never been my strong-suit, so it seldom looks better to me!). But with a little thought, you can see how it works. Essentially, when called (with an exponent greater than one), the function calls itself in the line:
return baseNumber * PositivePower(baseNumber, exponent - 1);
The “return” will have to wait, as will the “baseNumber *” part, because the function call
PositivePower(baseNumber, exponent - 1)
will be run first, at each level, until the base case is met and the result may propagate back upward. If one calls PositivePower(5f, 3) on this function, it will effectively drill down, first running these three function calls:
PositivePower(5, 3);
return 5 * PositivePower(5, 2);
return 5 * PositivePower(5, 1);
Upon reaching the bottom (the PositivePower(5, 1) call), the “if(exponent == 1)” conditional will be caught, putting an end to the recursion by merely returning the base (that is, 5). It returns this base to the previous caller (which was the previous iteration of the function itself), thus:
return 5 * PositivePower(5, 1);
becomes:
return 5 * 5
because 5 was returned up to this layer. Then, 5 * 5 = 25 is calculated, and 25 is in turn passed up to the next layer, such that:
return 5 * PositivePower(5, 2);
becomes:
return 5 * 25
which evaluates to 5 * 25 = 125, and finally returns 125 to the original caller, recursion complete. Recursion has an interesting self-similarity feature, related to geometric fractals (shapes whose smaller features, when zoomed in on, turn out to look very much like their larger features, potentially infinitely; something like this is true of the complicated shape of a seashore boundary).XI Of course when it comes to computers, one has made a mistake if either their loop or recursive function fails to terminate: such infinite loops will hang or crash a program, leaving it in a state in which it cannot get anything done, endlessly spinning its wheels. Abstract mathematics aside, physical objects approximated by fractal geometry, too, must terminate for similar reasons: at some point, the self-similar features one is zooming in on will be as small as the smallest possible objects, fundamental particles (and due to differences in the forces dominant in these physical regimes, any fractal pattern is likely to terminate much sooner, on much larger scales than that).
But we're all struck with the sense that computers and their programs are not like movies or slideshows: they don't do the same thing every time we boot them up. When you use a calculator app, you may run the same program, but input different calculations to be done. A more complex version of the same phenomenon is that when you play the video game Super Mario 64, you can decide which level to go to, how long to play, and where and when to run and jump from moment to moment. How does it do this? The answer ultimately lies in conditional statements. Very often a computer executes some block of code if some condition is met but some other block of code if other conditions are met (in the example of Mario, the player's character only jumps when the button for doing so is pressed). Let's demonstrate this with the much simpler example of a machine telling us whether a number is odd or even:
void OddOrEven(int number){
    if(number % 2 == 0){
        Console.WriteLine("The number " + number + " is even.");
    }else{
        Console.WriteLine("The number " + number + " is odd.");
    }
}
For starters, notice the return type is “void”— this means the function or subroutine doesn't actually return a value, it simply performs some code in-place (the caller does not expect anything in return). What you see next in the body is called an if-statement, the simplest of all conditionals. It simply means that if the Boolean statement inside of the parentheses evaluates to true, you execute the immediate code-block, and otherwise (else), you execute the other code block, instead. Similarly, the Mario video game checks if the player has pressed the jump button to execute the jump code block (and otherwise does not do so).
We should now address that in addition to mathematical operators, Boolean expressions may use comparison and logical operators: NOT, EQUIVALENCE, GREATER-THAN, LESS-THAN, GREATER-THAN-OR-EQUAL-TO, LESS-THAN-OR-EQUAL-TO, AND, OR, and XOR (we briefly went over NOT in the “Logic” chapter). While it is not the only place these are useful, Boolean expressions (expressions which evaluate to either true or false, rather than to a numerical answer) are very often found in conditional locations such as a for-loop's signature or an if-statement. In the above example, we saw the double-equals-sign, or EQUIVALENCE operator (==), which checks whether a value is equivalent to some other value (true) or not (false); this is not to be confused with the single-equals-sign assignment operator (=), which sets a variable to a value. The NOT operator (as briefly mentioned in the “Logic” chapter) is the exclamation point (!), and it simply inverts an expression that would otherwise evaluate to true so that it evaluates to false (and vice versa). One may check if a variable is NOT equal to a given value in a couple of ways by applying the NOT operator; here are two:
int a = 10;
bool answerOne = !(a == 11);
bool answerTwo = a != 11;
This small program sets an integer (a) to the value of 10, and then sets two Boolean variables (answerOne and answerTwo) equal to two equivalent expressions. Just as mathematical operators have an order-of-operations, so too do logical operators, and in both cases, parentheticals come first and foremost, so that the expression assigned to answerOne evaluates thus:
!(a == 11)
!(10 == 11)
!(false)
true
so that answerOne is set to true (because a does NOT equal 11). Here, a NOT was applied to the entire expression at the end, but one could as easily apply the NOT to the operator (or think of the exclamation-point-equals-sign (!=) as a separate, NOT-EQUIVALENT operator); this is what is done with the expression assigned to answerTwo:
a != 11
10 != 11
true
in which it is directly asked whether a is NOT-EQUIVALENT to eleven, and directly answered as true. If one understands the EQUIVALENCE operator, then the inequality operators (>, <, >=, <=)XII are fairly straightforward and a simple demonstration should elucidate them:
int w = 1;
int x = 10;
int y = 100;
int z = 100;
bool a = w < x;
bool b = z > y;
bool c = z >= y;
bool d = w <= z;
Here, the expression assigned to the variable, a, evaluates to (1 LESS-THAN 10), which is of course true. That for b evaluates to (100 GREATER-THAN 100), which is false (they're equivalent!). For c, (100 GREATER-THAN-OR-EQUAL-TO 100) is true because they are equal. Finally, for d, (1 LESS-THAN-OR-EQUAL-TO 100) is of course true. Other logical operators tend to combine multiple separate Boolean expressions in different ways. The AND operation combines two Boolean expressions into one by asking if both evaluate to true, in which case the entire AND expression evaluates to true (but otherwise, if only one expression is true, or both are false, the AND expression evaluates to false). The tendency in computer programming is for the double-ampersand to denote the AND operation (&&):
bool a = true;
bool b = false;
bool c = true;
bool bAndA = b && a;
bool aAndC = a && c;
Here bAndA is set to false because only (true AND true) equals true, but (b AND a) evaluates to (false AND true), which is false.XIII On the other hand, aAndC is set to true because both a and c are true, meaning the expression (a AND c) evaluates to (true AND true), which equals true. The OR operation, denoted by a double vertical bar (||), combines two expressions by checking if either or both of the values is true, and returning true if so (meaning that the only way for an OR statement to evaluate to false is if both terms are false):
bool thisOrThat = (10 == 10) || (5 > 20);
bool hereOrThere = (15 == 2) || false;
Here, the Boolean expression assigned to thisOrThat evaluates to (true OR false), which is true.XIV However, the Boolean expression assigned to hereOrThere evaluates to (false OR false), which is of course false. Our final operator is the somewhat rarely used EXCLUSIVE-OR, or XOR (denoted by the caret: ^).XV This quirky logical operation yields true when exactly one of the two expressions is true (if both are true, or both are false, it yields false).XVI Here are some examples:
bool thisXorThat = (400 == 401) ^ (5 == 5);
bool hereXorThere = true ^ (3 > 1);
Here, the expression assigned to thisXorThat evaluates to (false XOR true), which yields true overall. But the expression assigned to hereXorThere evaluates to (true XOR true), which is not true because the “exclusive” in EXCLUSIVE-OR means that only one operand may be true for the XOR to evaluate to true.
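The behavior of AND, OR, and XOR described above can be summarized as a truth table. The following is a minimal sketch that prints one; the class name TruthTableDemo and the helper function Row are assumptions made up for this illustration.

```csharp
using System;

public static class TruthTableDemo
{
    // Builds one row of the table: the two inputs, followed by the
    // results of AND (&&), OR (||), and XOR (^) applied to them.
    public static string Row(bool a, bool b)
    {
        return a + "\t" + b + "\t" + (a && b) + "\t" + (a || b) + "\t" + (a ^ b);
    }

    public static void Main()
    {
        Console.WriteLine("a\tb\tAND\tOR\tXOR");

        // Every possible combination of two Boolean inputs.
        bool[] values = { false, true };
        foreach (bool a in values)
        {
            foreach (bool b in values)
            {
                Console.WriteLine(Row(a, b));
            }
        }
    }
}
```

Reading down the printed columns, AND is true in only one row (both inputs true), OR is false in only one row (both inputs false), and XOR is true in exactly the two rows where the inputs differ.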
It's worth briefly discussing scope: a given variable or function exists in a particular context in which it may be accessed (but there may be other regions of code from which it has privacy). Generally, the scope of one's variable or function may be global or some shade of local. Typically, programs have some special function that is run as a starting point (from which other functions may be called), which is usually called “Main”. This Main function exists at the outermost layer of the code-file, meaning that it is relatively global in scope: other functions or variables defined in this file may be accessed from anywhere in the file by merely typing out the feature's name. One may provide an explicit access modifier, though (the implicit version of which is “private”), to define whether these features may be accessed by other files: “private” in this context means private to the file, while “public” means accessible even from other files (strictly speaking, C# scopes these modifiers to the enclosing class rather than to the file, but in a small program with one class per file the two coincide):XVII
// Private functions may only be accessed by code written in the same file
private static void Main(){
    Console.WriteLine(Square(1.5f));
}
// Public functions may be accessed even by other code-files
public static float Square(float squareRoot){
    return squareRoot * squareRoot;
}
To put a fine point on it: any function (such as Square) in this file may call Main (though Main should really never be called outside of being the main entry-point) or Square, but a function in another file may only call Square (because it's public), and not Main (because it's private). The same is true for variables:XVIII
// Private variables may only be accessed by code written in the same file
private static string message = "Hello world!";
// Public variables may be accessed even by other code-files
public const float SPEED_OF_LIGHT = 3E8f;
Now, variables (unlike functions) may also be declared inside of a function: these are truly local variables, and they may only be accessed (or considered to exist) inside of the function in which they were declared. Whereas variables and functions of the same file-and-scope may not have the same name (constituting a naming conflict), variables of different file-or-scope may step on each other's names (but doing so means that the local context can only access the locally named version and not the global one, short of using some extra syntax; it's generally best to avoid naming a local variable the same thing as a global variable, though doing so is not entirely uncommon). Consider the following code:
// Global variable
static int population = 150;
// Program entry point
private static void Main(){
    Console.WriteLine(TotalCatsOne(2));
    Console.WriteLine(TotalCatsTwo(2));
}
/* TotalCatsOne takes in a number of cats per person and calculates the
 * total number of cats, given the population. */
private static int TotalCatsOne(int catsPerPerson){
    return population * catsPerPerson;
}
/* TotalCatsTwo takes in a number of cats per person and calculates the
 * total number of cats, given the population. */
private static int TotalCatsTwo(int catsPerPerson){
    /* Creating a new local variable and denying access
     * within this function to the global variable of the
     * same name. */
    int population = 200;
    return population * catsPerPerson;
}
One would be mistaken to think that when Main calls each function, they will both print out “300”. The first function call, given TotalCatsOne has no local variable named “population”, will use the global variable and calculate 150 * 2 = 300 (thus printing “300” to the console). The second function call, given TotalCatsTwo does hide (or “shadow”) the global population variable with a local variable of the same name, will instead use that local variable, calculating 200 * 2 = 400 (and thus printing “400”). So the overall output after Main runs is:
300
400
This is why it is usually not smart to name local variables the same as global variables: you bar local functions easy access to the global variable, and may confuse yourself over what values are being used!
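For completeness, here is a minimal sketch of the "extra syntax" that reaches a global variable even after a local variable has taken its name. It assumes the global lives in a static class, so that it can be qualified with the class name; CatCensus and CompareTotals are hypothetical names chosen for this example.

```csharp
using System;

static class CatCensus
{
    // "Global" variable: a static field of the class
    static int population = 150;

    /* Returns both totals as text. The local variable hides the field,
     * so the bare name refers to the local one, while the class-qualified
     * name still reaches the field. */
    public static string CompareTotals(int catsPerPerson)
    {
        int population = 200;
        int localTotal = population * catsPerPerson;             // uses the local variable
        int globalTotal = CatCensus.population * catsPerPerson;  // uses the static field
        return "local: " + localTotal + ", global: " + globalTotal;
    }

    private static void Main()
    {
        Console.WriteLine(CompareTotals(2));  // prints "local: 400, global: 300"
    }
}
```

Even though this works, the two meanings of “population” sitting side by side illustrate why distinct names are usually the kinder choice.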
The concept of scope is quite interesting when applied to the philosophy of science. From the perspective of ontology, one might think that every variable is essentially global in scope, as would be true for an omniscient god. But in truth, there are regions of the universe with little contact with others (and which, as a result, are “aware” of little-to-no information from those other regions). In fact, over the roughly 13.8 billion years since the big bang, the universe has expanded such that not all regions have been able to remain in causal contact: it is physically impossible for these regions to know anything about each other that has changed in the intervening time since they were in contact at the very beginning of the big bang. Even more interesting is the application of scope to epistemology: as Homo sapiens, we are a part of the cosmos trying to understand the cosmos, and we do so with only limited information (though the process of science reveals ever more to us, and more importantly, explains that information). Engineers have to deal with this at a very practical level. For example, in a video game, the “ontology” of the game-world is effectively omniscient (to the degree this is not true, it is because the programmers explicitly used scope tools such as different files and access modifiers to separate the game's different parts). In principle, a city in Grand Theft Auto III could be coded such that each car knows the position, current speed, even the color of every other car on the road! In reality, as companies such as Tesla attempt to create self-driving cars, engineers have to deal with the fact that these cars cannot just dot-access any old car to the right or left of them (let alone on the other side of town) to assess its position, speed, and acceleration. Instead, the cars are fitted with all sorts of sensors (mostly specialized cameras whose images are analyzed with artificial intelligence, or AI) to sense nearby vehicles and measure their properties before reacting to them.
We are really no different! Without a brain, a stone doesn't know anything about its surroundings, but it may indeed be scarred by an information signature (for example, we can date objects, as in carbon-dating, by measuring the relative prevalence of known radioactive isotopes, and their decay products, inside of them). Life such as humans evolved fallible sense organs and brains (like the cameras and AI in the self-driving car) to learn those things about nature on the fly that they cannot simply intuit, as an omniscient god could. (Interestingly, we could get closer to the Grand Theft Auto scenario in the real world by having the cars wirelessly communicate their own self-knowledge to each other, such as speed, which is itself measured by a detector at the wheel, or location, which is worked out from distance measurements to several different satellites, typically four or more, in the Global Positioning System, or GPS.) Even then, self-driving cars need to make measurements; that is, they have to make do with a kind of epistemology: no facet of ontology has direct access to understanding even itself.
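The contrast between the game-world and the real world can be sketched in code. What follows is a toy model, not how any real game or self-driving system is written: the Car class, the Traffic class, the straight one-dimensional road, and the idealized sensor (a simple distance cut-off with no measurement error) are all assumptions made for illustration, as is the use of C#'s List collection to hold the cars.

```csharp
using System;
using System.Collections.Generic;

class Car
{
    public string name;
    public float position;  // meters along a straight road
    public float speed;     // meters per second

    public Car(string newName, float newPosition, float newSpeed)
    {
        name = newName;
        position = newPosition;
        speed = newSpeed;
    }
}

static class Traffic
{
    // Game-world view: the code may dot-access every car, however far away it is
    public static float FastestSpeed(List<Car> cars)
    {
        float fastest = 0f;
        foreach (Car car in cars)
        {
            if (car.speed > fastest)
            {
                fastest = car.speed;
            }
        }
        return fastest;
    }

    // Real-world view: only cars within sensorRange of the observer are detected
    public static List<Car> Sense(Car observer, List<Car> cars, float sensorRange)
    {
        List<Car> detected = new List<Car>();
        foreach (Car other in cars)
        {
            float distance = Math.Abs(other.position - observer.position);
            if (other != observer && distance <= sensorRange)
            {
                detected.Add(other);
            }
        }
        return detected;
    }

    private static void Main()
    {
        Car mine = new Car("mine", 0f, 20f);
        List<Car> cars = new List<Car>();
        cars.Add(mine);
        cars.Add(new Car("neighbor", 30f, 25f));
        cars.Add(new Car("across town", 5000f, 40f));

        Console.WriteLine(FastestSpeed(cars));            // knows about the car across town
        Console.WriteLine(Sense(mine, cars, 50f).Count);  // detects only the neighbor
    }
}
```

FastestSpeed plays the omniscient god, reading the speed of a car five kilometers away, whereas Sense only ever reports what lies within fifty meters of the observer.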
I would be remiss if I did not mention object-oriented programming (OOP). Most modern-day languages, including C# and Python, are object-oriented. OOP is a feature which allows programmers much more control over the data-structures in their programs: as with recursion and loops, there is nothing one can do with objects that they could not in-principle do without them, but there are many problems that such a feature makes far more tractable, in-practice.XIX Not only does OOP code tend to be more organized, but any time one wants to simulate a large collection of similar objects (especially complex objects), this additional organization may well make the difference between one actually pulling their project off or not. The idea at the core of OOP is to introduce the concept of objects: this is actually a special kind of variable type. The types that we learned about earlier (bool, int, float, etc.) are called primitives: they are the simplest data types. But the type object can be extended to make one's own less-primitive data types by combining multiple primitive variables into one. The blueprints for these new types of objects are called classes. Classes contain members, which may be both variables (of primitive or other object types; in this context, called fields) and functions (in this context, called methods), which one may dot-access, as you will see shortly. Here is an example of a class:
class Person : object{
    // Variables
    public string firstName;
    public string lastName;
    public int age;
    public float height;
    public float weight;
    // Constructor instantiates variables of this type
    public Person(string newFirstName, string newLastName,
                  int newAge, float newHeight, float newWeight){
        firstName = newFirstName;
        lastName = newLastName;
        age = newAge;
        height = newHeight;
        weight = newWeight;
    }
    // Methods / Functions
    public void IntroduceOneself(){
        Console.WriteLine("Hello, my name is " + firstName +
                          ", and I am " + age + " years old.");
    }
}
Here, our new data type (or object type) is called Person. As such, each Person comes with a first and last name, age, height, and weight. Every object also comes with a special function called a constructor—this is the code for instantiating (or creating an instance of) your object (the equivalent of initialization in primitives). Finally, each Person comes with a function in which a print-out introduces the Person by name. The following code (located in another class or file) is how one would instantiate two people, and manipulate these objects:
class Program : object{
    private static void Main(){
        // Create a new Person, representing Steve
        Person steve = new Person("Steven", "Gussman", 28, 75.75f, 225f);
        // Create a new Person, representing Jake
        Person jake = new Person("Jake", "Gussman", 16, 74.5f, 210f);
        // Dot access jake's age, to change it
        jake.age = 26;
        // Introduce both people by dot-accessing their introduction functions
        steve.IntroduceOneself();
        jake.IntroduceOneself();
    }
}
As one can see, with objects, one may dot-access an instance's fields to manipulate said object.XX The above code would print out:
Hello, my name is Steven, and I am 28 years old.
Hello, my name is Jake, and I am 26 years old.
reflecting that after instantiating jake, we dot-accessed the age to correct it, before calling on him to introduce himself.
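To hint at why objects pay off when simulating a large collection of similar things, here is a minimal sketch in which a single loop updates every person at once. It makes several assumptions for the sake of a self-contained example: Person is trimmed down to two fields, its method returns the introduction as text rather than printing it, and the Birthday class, the AgeEveryone function, and the use of C#'s List collection are hypothetical choices for illustration.

```csharp
using System;
using System.Collections.Generic;

// A trimmed-down Person, repeated here so that the sketch stands on its own
class Person
{
    public string firstName;
    public int age;

    public Person(string newFirstName, int newAge)
    {
        firstName = newFirstName;
        age = newAge;
    }

    // Returns the introduction as text instead of printing it
    public string Introduction()
    {
        return "Hello, my name is " + firstName + ", and I am " + age + " years old.";
    }
}

static class Birthday
{
    // One loop updates every Person, however many there are
    public static void AgeEveryone(List<Person> people)
    {
        foreach (Person person in people)
        {
            person.age = person.age + 1;
        }
    }

    private static void Main()
    {
        List<Person> people = new List<Person>();
        people.Add(new Person("Steven", 28));
        people.Add(new Person("Jake", 26));

        AgeEveryone(people);

        foreach (Person person in people)
        {
            Console.WriteLine(person.Introduction());
        }
    }
}
```

The same AgeEveryone function would work unchanged on a list of two people or two million, which is the kind of organization that separate, loose variables for each person's name and age could not offer.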
〰〰
The wider application of these concepts outside of mere personal computers is known as the philosophy of computation, systems-thinking, or computational thinking. Even with just these basics, one can already begin to see how the worlds of video games are defined by deterministic computer code. And once you see this, you naturally wonder if the real world itself could be modeled the same way: as functions conditioned on the values of variables and constants (inputs, or arguments). In fact, in conjunction with a bit of an education in physics, you will quickly find yourself wondering how it could be any other way.
Footnotes:
0. The Philosophy Of Science table of contents can be found, here (footnotephysicist.blogspot.com/2022/04/table-of-contents-philosophy-of-science.html).
I. See Our Mathematical Universe by Tegmark (pp. 335).
II. See How To Create A Mind by Kurzweil (pp. 188).
III. See Starting Out With Visual C#: Fourth Edition by Tony Gaddis (Pearson Education) (2012 / 2014 / 2017) (pp. 7) (though I have not finished this book).
IV. The asterisk (*) is used to represent the multiplication operator in the present “Computation” chapter whereas the cross (×) is used in the “Mathematics” chapter because most programming languages (including C#) use the asterisk (which is easily accessed on any keyboard).
V. The computer science terms “bool” and “boolean” refer to logician George Boole, whose early work on logic and binary mathematics turned out to be the foundation of computer science, later on.
VI. There are so-called weakly-typed languages which do indeed allow more freedom in the use of variables for different purposes in succession. However, this can be very hard for a programmer to keep track of, and almost inevitably leads to computer programs crashing as algorithms and data-structures mismatch and cause unintended behavior in the CPU.
VII. Conventions such as variable naming are largely just that, and not technically part of the syntax of the language, in that the compiler / interpreter does not care whether you capitalize a letter or not (focusing instead on reserving keywords and characters that are not allowed to be used in variable names at all). Here, with “taxedPrice”, we see the camel-case convention for variable names: the name starts with a lower-case letter, but each subsequent new word in the variable name is capitalized like a camel's humps.
VIII. In computer science, functions may also be called methods, sub-routines, or co-routines (the lattermost being a special kind of function which runs in tandem alongside other code rather than the calling code waiting for the function to complete), but in the interest of unity with mathematics, we will tend to call them all functions, here.
IX. Here I have used the C# convention for function names in which even the first letter of a function name is capitalized (to differentiate functions / methods from variables / fields at a glance).
X. Don't fret over the “if” statements that have begun to show up, for these will be discussed shortly.
XI. See The Physics Of Wall Street: A Brief History Of Predicting The Unpredictable by James Owen Weatherall (Houghton Mifflin Harcourt) (2013) (pp. 54-55)
XII. In mathematics, the “or-equal-to” versions of the operators are denoted by a line under the relevant symbol (≥ and ≤ stand for GREATER-THAN-OR-EQUAL-TO, and LESS-THAN-OR-EQUAL-TO, respectively), whereas in computer programming, an equals sign is simply appended to the relevant symbol for ease of use with standard keyboards (>= and <=).
XIII. In fact, because this is an AND statement in which the first expression evaluates to false, it is known as a short-circuit solution, because the second expression (the one to the right of the &&) doesn't even need to be evaluated, as the moment there exists any false expression in an AND statement, the final answer must be false. Computers are designed to exploit this efficiency—they really do not waste resources like processing power and time evaluating the second expression in the case of a short-circuit solution (and you shouldn't, either).
XIV. Again, this is actually a short-circuit solution as the moment there is a true evaluation in an OR statement, the overall answer must be true with no need to evaluate the expression after the (||) operator.
XV. Note that only a single caret is used because there is no short-circuit solution for the XOR operation; the answer always depends on both values.
XVI. The nature of this expression is such that there is no short-circuit solution, as the answer always depends on both operands.
XVII. As we will see in the later discussion on object-oriented programming, there is a subtlety here in object-oriented languages, such as C#. The file is not the unit of importance, per se, but what is known as a “class”. Suffice it to say that most of the time, a file contains only one “class” and so their interchangeable use is fine, but often enough, multiple classes are defined in the same file, and these classes will not be able to access each other's “private” fields. Further, the reader may have noticed the “static” keyword in the function signatures, here. This is because while all code exists inside of some class, not all classes are instantiated as objects. The outermost class where the Main function lives is a static class with static functions (such as Main), which allows them to be called and run without another class first declaring and instantiating an object of that class' type. Again, this will make more sense after reading the upcoming section on object-oriented programming.
XVIII. The “const” keyword denotes that a “variable” is actually a constant: it cannot be changed after initialization (by convention, these are written in all-caps, with underscores between words in the name). Constants cannot be marked static because they are already treated that way by default, as they cannot have their value changed (they are read-only). In C#, one can use scientific notation with floating point variable types: here, as in mathematics, “3E8” is equivalent to 3 × 10⁸.
XIX. The dichotomy of in-principle and in-practice will be further explored in the “In-Principle And In-Practice” chapter.
XX. Object-oriented programming evolved out of an older feature of more primitive languages, such as C, called a struct. Structs were pieces of memory one set aside and filled with a collection of primitive variables which could be dot-accessed from the struct's name (sound familiar?). It is likely that most of this evolution occurred in the lineage from primitive C to object-oriented C++.
Change Log:
Version 0.01 9/8/22 9:38 PM
- Fixed malfunctioning footnote links
Version 0.02 9/8/22 9:43 PM
- Fixed footnotes linking out to the same-numbered footnote of the wrong chapter
Version 1.00 1/8/23 1:56 AM
- Fixes:
"CH 5
BODY AND FNs [CHECK]
Text that isn't console should be Times New Roman
FN 1 [CHECK]
Tegmark 335, remove Google"
- Changed title to "1st Edition"
Version 1.01 2/12/23 1:27 PM
- Substantive fixes in line with the Print Version 1.02