Monday, April 29, 2013

Critical Thinking: The Scientific Method in 5 Steps

Introduction to the Scientific Method in the Context of Critical Thinking

In the last few lessons we've looked at 5 common argument schemes: generalizations, polling, general causal reasoning, particular causal reasoning, and arguments from ignorance.  As luck would have it, these are the most common argument schemes you will find in (good and bad) scientific arguments.  Arguments are important to the scientific enterprise because a core activity of science is to provide reasons and evidence (i.e., arguments) for why one hypothesis should be accepted over another.  This question of why we should choose one hypothesis over another (or any hypothesis at all) brings up many interesting philosophical issues which (time permitting) we will briefly explore.  However, before putting on our philosopher hats, let's put on our lab coats, turn on our Bunsen burners, and take a closer look at the scientific method.

We can break up the scientific method into 5 steps:

Step 1:  Understanding the Issue
In this first step, the goal is simply to determine what it is exactly that we want to know.  Usually, it will be a problem that we want solved.  Examples might include: What is the mass of an electron?  Can vaccines prevent measles?  Can Tibetan monks levitate?  Is the earth round?  Can wi-fi cause health problems?  Does the color red make people feel hungry?  How do magnets work?  Does honey diminish the severity of coughs?

As you can see, some of these issues will involve questions about causation while others might be about identifying something's properties.

Step 2:  Formulating a Hypothesis
In the next step, we want to formulate a hypothesis that will solve our problem and the hypothesis must be testable (recall that a non-testable hypothesis is non-falsifiable and thus considered pseudo-scientific).  

To illustrate how this works, let's consider the problem of whether honey diminishes the severity of coughs.  Our basic hypothesis will be "honey diminishes the severity of coughs."

However, often our hypothesis will extend beyond a simple "yes" or "no".  We will want to know why it does or doesn't have a particular effect on a cough.  This is known as the "causal mechanism"; i.e., the thing that causes the effect that our hypothesis anticipates.  So, if honey diminishes the severity of coughs, we will want to know why.  If we don't know why, then it may simply be correlation; we are trying to establish causation.  Maybe it's the tea we drink the honey with that causes the diminished severity.  Or maybe it isn't honey itself that causes the reduced severity; maybe it's the sugars in honey, and so any sweet substance will do.

Part of establishing causation is to rule out competing hypotheses.  So, if someone says that honey diminishes the severity of coughs because the sweetness in honey activates some particular receptor cells that in turn help diminish the severity of the cough, then we can test that.  Someone else might say it's because the honey reduces swelling in the throat.  We can test that too.  Or someone else might say honey has some anti-bacterial or anti-viral compounds which kill the bacterial/viral cause of the cough.

The point is, we need to pick a hypothesis that (preferably) is specific enough to include a causal mechanism.  Let's choose the first one.

Hypothesis (h):  Drinking honey can reduce the severity of coughs.
Causal Mechanism:  h because

"the close anatomic relationship between the sensory nerve fibers that initiate cough and the gustatory nerve fibers that taste sweetness, an interaction between these fibers may produce an antitussive effect of sweet substances via a central nervous system mechanism."

Fallacy Alert!  Aruga!  Aruga!  In scientific debates it's very important to hold your opponent to their hypothesis (and also to keep to yours when facing objections or contravening evidence).  Changing the hypothesis mid-debate is called moving the goal posts.  This is a very common practice among purveyors of pseudo-science or members of the anti-science ideologies.  

For example, for years anti-vax groups opposed vaccines because--they hypothesized--thimerosal causes autism.  Because this myth became so pervasive (despite overwhelming evidence to the contrary), and in order to ensure compliance rates high enough for herd immunity, many national health departments changed to the more expensive thimerosal-free versions of the vaccines.  Contrary to the anti-vax hypothesis, removal of thimerosal from vaccines was followed by autism rates actually going up rather than down!  (There's some weak evidence to suggest that vaccines can actually inhibit some kinds of autism.)

Now that thimerosal is removed from vaccines and the anti-vax hypothesis has been shown to be empirically false, what do you think the response of the anti-vax crowd is?  If you guessed, "oh, let's support vaccines now," you were sleeping for the 2 weeks of this class!  The response was to "move the goal posts."  Now it's "too many too soon!" or "it's got aluminum in it!" or "it's got mercury in it!".

Step 3:  Identifying the Implications of the Hypothesis
In the next step we need to set out our expectations for what we'd expect to see (i.e., observations) if our hypothesis is correct.  It's very important that this is done before the experiments are conducted.  In the case of the honey, we'd expect to see that (a statistically significant number of) people who have a cough will cough less frequently and violently than a comparable group of people with a cough who don't take honey (or any other "medicine").  In the case of thimerosal, we might say: if it's true that thimerosal causes autism, then when we remove thimerosal from vaccines we should expect to see autism rates decline.

We can formalize this structure:

If the hypothesis (h) is true, then x will occur.  (x is our expected observable outcome).  

So, in the case of honey, if our hypothesis is true, then those who drink honey will have reduced severity of coughing compared to a control group.
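To make the honey test concrete, here's a minimal sketch in Python of how such a comparison might be analyzed.  The cough-severity scores are invented purely for illustration (they're not real data), and the permutation test used here is just one of several standard ways to check whether an observed difference between groups could plausibly be due to chance:

```python
import random
import statistics

random.seed(42)

# Hypothetical cough-severity scores (0-10 scale), invented to show an effect.
honey = [round(random.gauss(3.0, 1.0), 1) for _ in range(30)]
control = [round(random.gauss(5.5, 1.0), 1) for _ in range(30)]

observed_diff = statistics.mean(control) - statistics.mean(honey)

# Permutation test: if the group labels were causally irrelevant, how often
# would reshuffled labels produce a difference at least this large?
pooled = honey + control
extreme = 0
n_perm = 5000
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[30:]) - statistics.mean(pooled[:30])
    if diff >= observed_diff:
        extreme += 1

p_value = extreme / n_perm
print(f"observed difference: {observed_diff:.2f}, p = {p_value:.4f}")
```

If "honey" vs "control" made no causal difference, shuffling the labels should produce gaps as big as the observed one fairly often; a tiny p-value says otherwise.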

Step 4:  Testing the Hypothesis
As you might expect, once we've set up our hypothesis and established the anticipated observable effects that would confirm the hypothesis, we test!

Recall step 2: when we form the hypothesis, we should ensure that the hypothesis is testable.  That is to say, we can state in advance what will constitute observable confirmation or disconfirmation of the hypothesis.  A couple of notes on why we must do this in advance: (1) it prevents retrofitting the data to fit the hypothesis; (2) it prevents the "moving of the goal posts".

Testing in Principle vs Testing in Practice
Finally, we should be aware that not all hypotheses will in practice be testable, but they must be so in principle.  For example, we can construct a hypothesis about what will happen if a large asteroid hits the earth, but we don't need to actually destroy half the earth to confirm the hypothesis that such an impact would indeed destroy half the earth.  In some cases, running a computer simulation will do!

Step 5:  Reevaluating the Hypothesis 
In step 4, I emphasized that the predicted confirmatory results of the hypothesis must be stated in advance to avoid retrofitting and moving the goal posts.  However, this does not mean that once we have conducted a test we can't modify the test or the hypothesis.  This is perfectly legitimate, but it must be done in a way that recognizes the shortcomings of the original test and/or hypothesis.

Fallacy Alert!  Aruga!  Aruga!  Aruga! When the implications of our hypothesis are confirmed we must be careful not to immediately conclude that our hypothesis is confirmed.  From the fact that our anticipated effect occurred it doesn't necessarily follow that our hypothesis is true.  This is called the fallacy of affirming the consequent, which looks like this:  

P1  If h, then x.  (In fancy talk, h is called the antecedent and x is called the consequent.)

P2  x occurred.
C   Therefore, h is true. 

To see why h doesn't necessarily follow, given that P2 is true (i.e., "affirming" the consequent), consider the following case.

P1  If it's raining, it's cloudy.

P2  It's cloudy.
C   Therefore, it's raining.

Just because it's cloudy doesn't mean it's raining.  It can be cloudy without it being rainy.  It can also be partially cloudy with chances of sunshine in the evening, followed by overcast skies at night... you get the point.  
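For the logically inclined, we can verify that this argument form is invalid by brute force: enumerate every truth-value assignment to h and x, and look for a case where both premises hold but the conclusion fails (here h and x are just boolean placeholders):

```python
from itertools import product

def implies(p, q):
    """Material conditional: 'if p, then q' is false only when p is true and q is false."""
    return (not p) or q

# Premises: (h -> x) and x.  Conclusion: h.
# A counterexample is an assignment where the premises are true but h is false.
counterexamples = [
    (h, x)
    for h, x in product([True, False], repeat=2)
    if implies(h, x) and x and not h
]
print(counterexamples)  # -> [(False, True)]: the "cloudy but not raining" case
```

One counterexample is enough to show the form is invalid: h = False, x = True makes both premises true and the conclusion false.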

In relation to scientific hypotheses we can imagine the following scenario:  Someone suggests a hypothesis h and anticipates a certain observable consequence x.  But does it follow that just because x occurred the hypothesis is true?  Nope.  There are many possible alternative reasons (or causes) besides h for which x might have occurred.

If we think back to the sections on general causal reasoning we can see why.  If the hypothesis is a causal one, then there are several steps we need to go through before we can attribute causality.  Maybe there's only a statistical relationship (correlation) between the two variables.  Maybe there's some other, better explanation for why x is occurring.  Maybe the methodology was flawed (no double blinding, hence placebo effect; problems with representativeness and sample size; etc.).

Summary:  Steps of the Scientific Method
1.  Understand the Problem that requires a solution or explanation.
2.  Formulate a hypothesis to address the problem. 
3.  Deduce the (observable) consequences that will follow if the hypothesis is correct. 
4.  Test the hypothesis to see if the consequences do indeed follow.
5.  Reevaluate (and possibly reformulate) the hypothesis.  


Tuesday, April 23, 2013

Critical Thinking: Arguments from Ignorance, God, and GMOs

The next argument scheme we will look at is what's known as the argument from ignorance.  An argument from ignorance (or argumentum ad ignorantium if you want to be fancy) is one that asserts that something is (most likely) true because there is no good evidence showing that it is false.  It can also be used the other way to argue that a claim is (most likely) false because there's not good evidence to show that it's true.

Let's look at a couple of (valid) examples:

There's no good evidence to show that the ancient Egyptians had digital computers.  (This evaluation comes from professional archeologists), therefore, they likely didn't have digital computers.


There's no good evidence to suppose the earth will get destroyed by an asteroid tomorrow.  (This evaluation comes from professional astronomers),  so we should assume it won't and plan a picnic for tomorrow.


There's no good geological evidence that there was a world-wide flood event.  (This evaluation comes from professional geologists), therefore we should assume that one never happened.

Formalizing the Argument Scheme
As you may have guessed, we can formalize the structure of the argument from ignorance:
P1:  There's no (good) evidence to disprove (or prove*) the claim.
P2:  There has been a reasonable search for the relevant evidence by whomever is qualified to do so.
C:    Therefore, we should accept the claim as more probable than not/true.
C*:  Therefore, we should reject the claim as improbable/false.
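We can make the role of P2 vivid with a toy Bayesian model (the probabilities here are illustrative assumptions, not measured values): how much an empty search should lower our confidence in a claim depends on how likely the search was to find evidence if the claim were true.

```python
def posterior_given_no_evidence(prior, p_find_if_true, p_find_if_false=0.0):
    """P(claim is true | a search found nothing), by Bayes' rule."""
    p_none_if_true = 1 - p_find_if_true
    p_none_if_false = 1 - p_find_if_false
    numerator = p_none_if_true * prior
    return numerator / (numerator + p_none_if_false * (1 - prior))

prior = 0.5  # start agnostic about the claim

# Strong P2: a thorough search would almost surely have found evidence if true.
thorough = posterior_given_no_evidence(prior, p_find_if_true=0.99)
# Weak P2: a cursory search would probably miss the evidence even if true.
cursory = posterior_given_no_evidence(prior, p_find_if_true=0.10)

print(f"after thorough search: {thorough:.3f}; after cursory search: {cursory:.3f}")
```

A thorough empty search drives the probability of the claim way down; a half-hearted one barely moves it.  That's exactly why criticism of arguments from ignorance targets P2.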

Good and Bad Use of Argument from Ignorance
The argument from ignorance is philosophically interesting because sometimes the same structure can be used to support the opposing position.  The classic example is the debate over the existence of God.  Let's look at how both sides can employ the argument from ignorance to try to support their position.

Pro-God Arg 
P1:  You can't find any evidence that proves that God or gods don't exist.
P2:  We've made a good attempt to find disconfirming evidence, but can't find any!
C:   Therefore, it's reasonable to suppose that God or gods do exist.

Vs God Arg 
P1:  You can't show any evidence that God or gods do exist.
P1*:  Any evidence you present can also be explained through the natural laws.
P2:  We've made a good attempt at looking for evidence of God's/gods' existence but can't find any! (I even looked under my bed!)
C:  Therefore, it's reasonable to suppose that God/gods don't exist.

This particular case brings out some important issues we studied earlier in the course such as bias and burden of proof.   Not surprisingly, theists will find the first argument convincing while atheists will be convinced by the latter.  This of course brings up questions of burden of proof.  When we make a claim for something's existence, is it up to the person making the claim to provide proof?  Or does the burden of proof fall on the critic to give disconfirming evidence?  In certain questions, your biases will pre-determine your answer.

While in the above issue there is arguably reasonable disagreement on both sides, there are other domains where the argument from ignorance fails as a good argument.  As you might guess, this will have to do with the acceptability of P1 (i.e., there is/is no evidence) and P2 (i.e., a reasonable search has been made).  Most criticism of arguments from ignorance will focus on P2--that the search wasn't as extensive as the arguer thinks.  Generally, we let P1 stand because it is usually an author's opinion to the best of their own knowledge.  Recall from the chapter on determining what is reasonable that we typically let personal testimony stand.

We can illustrate a poor argument from ignorance with an example.  Claim: There's no evidence to show Obama is American, therefore he isn't American.

Let's dress the argument up to evaluate it:
P1:  I've encountered no good evidence to show that Obama is an American citizen.
P2:  Numerous agencies and individuals trained in the search and identification of state documents have been unable to locate any relevant documents.
C:   Obama isn't American (and is a Communist Muslim).

Regarding P1, maybe the arguer hasn't encountered any evidence, so we'll leave it alone.  P2, however, has problems.  There have been reasonable searches for the evidence, and that evidence was found.  Perhaps the arguer was unaware or didn't truly exert him/herself enough.  The argument fails because P2 is not acceptable (i.e., false).

We can also typically find the argument from ignorance used in arguments against new (or relatively new) technologies in regards to safety or efficacy.  For example:

We should ban GMOs because we don't know what the long-term health effects are.

P1:  I've found no evidence that shows that GMOs are safe for human consumption.
P2:  Those qualified to do studies and evaluate evidence have found no compelling evidence to show that GMOs are safe for human consumption.
C:  Therefore, we should assume GMOs are unsafe and ban them until we can determine they are safe.

If we were to criticize this argument we'd consider P2.  In fact, there have been quite a few long term studies done by those qualified to assess safety.  At this point we will have a debate over quality of evidence.  Some on the anti-GMO side dispute the quality of the evidence (i.e., it was funded by company x, and therefore it is questionable).  In a full analysis we'd consider this question in depth, but for our purposes here, we might legitimately challenge the claim that there is no available evidence purporting to demonstrate safety.

As an aside, notice that we can also use the argument from ignorance for the opposite conclusion:  There's no compelling evidence to show that GMOs are unsafe for human consumption in the long-term, therefore, we should continue to make them available/ should not regulate them.

The "team" that wins this battle of arguments from ignorance will have much to do with our evaluation of P2:  That there legitimately is or isn't quality evidence one way or the other.

Final Notes on Arguments from Ignorance
We can look at arguments from ignorance as probabilistic arguments.  That is, given that there is little or no evidence for something, what is the likelihood that it still might exist?  This is especially true for claims that something does exist based on an absence of evidence for its non-existence.  However, as Carl Sagan famously said, "absence of evidence is not evidence of absence."  In other words, just because we can't find evidence for something doesn't mean that the thing or phenomenon doesn't exist.

On the flip side, this line of argument can also be used to support improbable claims.  Consider such an argument for the existence of unicorns or small teapots that circle the Sun:  There's no evidence that unicorns don't exist or small tea pots don't circle the Sun, therefore we should assume they exist.

At this point we should return to the notion of probability:  Given no positive evidence for these claims, what is the probability that they are true (versus the probability that they aren't)?  It seems that, given an absence of evidence, the probability of there being unicorns is lower than the probability that they do not exist.  Same goes for the teapot.

Typically, in such cases we say that the burden of proof falls on the person making the existential claim.  That is, if you want to claim that something exists, the burden is upon you to provide evidence for it; otherwise, the reasonable position is the "null hypothesis."  The null hypothesis just means that we assume no entity or phenomenon exists unless there is positive evidence for its existence.  In other words, if I want to assert that unicorns exist, using the argument from ignorance won't do.  It's not enough for me to make the claim based on an absence of evidence.  This is because we'd expect some evidence to have turned up by now if there were unicorns (i.e., P2 of the implied argument would be weak).

This brings us to another Carl Sagan quote (paraphrasing Hume): "Extraordinary claims require extraordinary evidence."  Or as Hume originally put it: "A wise man proportions his belief to the evidence."  Claiming that unicorns exist is an extraordinary claim, and so we should demand evidence in proportion to the "extraordinariness" of the claim.  This is why an ad ignorantium argument fails here; it doesn't offer any positive evidence for an extraordinary claim, only absence of evidence.  We'll discuss this principle of proportionality more in the coming section.  For now, just keep it in mind when evaluating existential arguments from ignorance.
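Sagan's slogan has a natural Bayesian reading.  As a rough sketch (all the numbers below are made up purely for illustration), the same quality of evidence that makes an ordinary claim probable leaves an extraordinary claim, one with a tiny prior, deeply improbable:

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """P(claim is true | evidence observed), by Bayes' rule."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))

# Ordinary claim: "it rained last night" (prior 0.3), with decent evidence
# (wet pavement is 9x more likely if it rained than if it didn't).
ordinary = posterior(0.3, 0.9, 0.1)

# Extraordinary claim: "a unicorn was in the garden" (prior one in a million),
# with evidence of exactly the same strength (e.g., a blurry photo).
extraordinary = posterior(1e-6, 0.9, 0.1)

print(f"ordinary claim: {ordinary:.3f}; extraordinary claim: {extraordinary:.8f}")
```

The evidence is equally strong in both cases; only the prior differs.  To make the unicorn claim probable, the evidence itself would have to be extraordinarily hard to explain away.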

Wednesday, April 17, 2013

Critical Thinking: Particular Causal Reasoning and Arguments from Ignorance

In the previous section we looked at general causal reasoning.  Now we're going to kick it up a notch with particular causal reasoning.  Essentially, particular causal reasoning is when we apply causal reasoning to explain a particular effect in terms of a particular (or sometimes general) cause.  For example: I got bronchitis because of the fire down the street.  Contrast this with a general causal claim, which might be something like "some people get lung infections from large fires."

Notice that my particular causal reasoning implicitly relies on a general causal claim.  That is, it is an instance of the general claim.  Bronchitis is an instance of the more general category "lung infections" and the house fire down the street is an instance of the more general category "large fires."  This fact will be important in our evaluation.

The Argument Structure of Particular Causal Reasoning
Recall that general causal reasoning has the structure "X causes Y," where X is all or most objects/events called X and Y is all or most objects/events called Y.  For example, if I say "food motivates dogs to perform tricks" we'd represent this argument as "X causes Y," where X=different types of food and Y=dogs performing tricks.

We can make a particular causal claim too.  I might say "pup-eroni motivates Ottis to sit."  We can represent the particular claim as "this x causes this y," where x = a particular type of food and y = a particular dog's performing the trick.  (Notice that in the general claim we use uppercase variables and in the particular claim we use lowercase.)

So, why does all this alphabet soup matter?  Well, it's not too important but it illustrates that particular causal reasoning depends on more general causal reasoning.  In other words, for every particular causal claim there will be a more general claim upon which it depends.  Also, that the variables in particular claims are instances of more general categories.

Let's take a quick step back:  We know from our general critical thinking that the strength of a conclusion partly relies on the truth of its premise(s) (i.e., premise acceptability).  Consequently, if we can show that a premise is weak, then the plausibility of the conclusion will be negatively affected.

As applied to particular causal claims, we should recognize that a particular causal claim has as a supporting premise a more general causal claim.  So, if we can show that the general causal claim is weak or strong (using the skills we acquired in the section on general causal reasoning) then this will impact the relative strength of the particular causal claim in the conclusion.

There are a few other ways to evaluate a particular causal claim.  Lets make the argument structure explicit first:

Argument Structure for a Particular Causal Argument
P1:  X causes Y.  (This is the general claim)
P2:  Appealing to the general causal principle in P1 is the best explanation of y.  (y=the particular effect)
C:   Therefore, this x caused this y.

So, we've already said that one aspect of our evaluation will involve evaluating the general claim upon which the particular claim relies.  The second will be regarding (P2):  Is P1 the best explanation of y? (y=the particular effect); that is, is y best explained in terms of the general claim in P1?  Or might there be better explanations?  Here we might appeal to other causal explanations for y.

This is How We Do It....(Note: I've added a few steps of analysis not included in your textbook)
We can illustrate how to evaluate a particular causal claim with an example:  Suppose someone argues, after winning at the casino, that they won because they were wearing their lucky sweater.  (I've actually heard this.)  We might say that the particular claim is an instance of the more general claim that certain objects can bring good luck.  Let's formalize the structure of the argument:

P1  Some objects can bring good luck. (Some Xs cause Y.)
P2  The principle in P1 (i.e., some objects bring good luck) is the best explanation of why I won at roulette.
C    Therefore, I won at roulette because I was wearing my lucky sweater.

Step 1
How might we evaluate this claim?  First, a quick note: the general claim of which the particular claim is an instance won't always be stated.  Often it will be up to the reader to infer it.  So, let's call that step 1.  In this case, the general causal principle we can infer from the conclusion is that some objects can bring about good luck in gambling (or just generally).

The general causal principle we infer (when not explicitly stated) will require a little bit of subjective judgment on our part to determine scope.  So, as I've indicated, in this example we might say that the general principle is that some objects bring good luck across all activities, or maybe they only bring good luck in gambling.  Here we should apply the principles of charity and reasonableness to avoid constructing strawmen.

Step 2
Decide whether the general causal claim is plausible.  That is, apply the evaluative principles we learned in the previous section.  I won't repeat them here, but in this case we'd look at P1 and decide whether "some objects can bring about good luck" is a good causal principle.  We'd also consider questions about the representativeness of the sample, depending on what information is available to us.

Step 3a
Decide whether the principle in P1 is the best explanation of the effect (y).  Here we ask whether the fact that I had with me a particular "lucky" object (i.e., wore a particular sweater) is the best causal explanation of my good luck at the roulette table.  In other words, if we accept P1 as true--that some objects cause good luck--is the particular effect y (i.e., this instance of winning at roulette) best explained in terms of the implied general causal principle (that some objects bring about good luck)?

We can think of this type of evaluation in terms of comparing competing explanations.  We might ask, are there other plausible explanations for why I won at the roulette table?  If so, are they more likely than the one suggested?  If the competing explanations are more plausible than the one that's implied, this will have consequences on the plausibility of the conclusion.

In this case we might suggest that there is a more plausible general principle for why I won.  Statistically, someone (who places a bet) eventually will win at the roulette table and I just happened to be there at the right time and place.  If I had kept playing, regression toward the mean would have taken over and we'd have seen that my performance with the lucky sweater was no better than the general statistical odds of the game.
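The statistical explanation is easy to check with a quick simulation.  Assuming European-roulette odds (a single-number bet wins with probability 1/37), a few thousand out of every hundred thousand players will win on any given spin, and every one of them could credit a lucky sweater:

```python
import random

random.seed(7)
P_WIN = 1 / 37  # single-number bet on a European wheel

n_players = 100_000
# Each player makes one single-number bet; count how many happen to win.
wins = sum(random.random() < P_WIN for _ in range(n_players))

# Plenty of winners emerge by chance alone -- no lucky objects required --
# and the overall win rate sits right at the house odds.
rate = wins / n_players
print(f"{wins} winners out of {n_players}; rate = {rate:.4f} (expected {P_WIN:.4f})")
```

The "lucky sweater" hypothesis predicts a win rate above 1/37 for sweater-wearers; chance alone already predicts lots of individual winners, which is the more plausible explanation.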

Step 3b
Related to the question of whether the general causal principle is the best explanation of the particular event we can ask an "identity" question.  That is, we can ask whether y really is an instance of Y and whether x really is an instance of X.   This step will be very similar to what we do when we evaluate generalizations.  

Let's quickly summarize how we'd go about evaluating a particular causal claim.  (1) The particular claim relies on a more general causal claim, so if the general claim isn't explicitly stated, you should try to infer what it is.  (2) Decide whether the general causal claim is plausible (using the criteria from the section on general causal reasoning).  (3a) Decide whether the general causal claim (if assumed true) is the best explanation of the particular causal claim.  (3b) Decide whether the particular cause and effect are indeed instances of the general categories of cause and effect in the general causal claim.

Tuesday, April 16, 2013

Critical Thinking: General Causal Reasoning

Being able to separate correlation from causation is a cornerstone of good science.  Many errors in reasoning can be distilled to this mistake.  Let me preface this section by saying that making this distinction is by no means a simple matter, and much ink has been spilled over the issue of whether it's even possible in some cases.  However, just because there are some instances where the distinction is indiscernible or difficult to make doesn't mean we should make a (poor) generalization and conclude that it is indiscernible or difficult in all instances.

We can think of general causal reasoning as a sub-species of generalization.  For instance, we might say that low-carb diets cause weight loss.  That is to say, diets that are lower in the proportion of carbohydrate calories than other diets will have the effect of weight loss on any individual on that diet.  Of course, we probably can't test every single possible low-carb diet, but given a reasonable sample size we might make this causal generalization.

A poor causal argument commits what is called the fallacy of confusing correlation with causation, or just the correlation-causation fallacy.  Basically, this is when we observe that two events occur together, either statistically or temporally, and so attribute to them a causal relationship.  But just because two events occur together doesn't necessarily imply that there is a causal relationship.

To illustrate: the rise and fall of milk prices in Uzbekistan closely mirrors the rise and fall of the NYSE (it's a fact!).  But we wouldn't say that the rise and fall of Uzbek milk prices causes the NYSE to rise and fall, nor would we make the claim the other way around.  We might plausibly argue that there is a weak correlation between the NYSE index and the price of milk in Uzbekistan, but it would take quite a bit of work to demonstrate a causal relationship.
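It's easy to manufacture an impressive correlation with no causal link at all.  In this sketch (the series are purely invented, not real milk or market data), two independent noisy series that merely share an upward trend end up correlating strongly:

```python
import math
import random

random.seed(1)

# Two invented series driven by the same background trend (e.g., inflation or
# economic growth) plus independent noise; neither series causes the other.
trend = list(range(100))
milk_price = [t + random.gauss(0, 5) for t in trend]
nyse_index = [t + random.gauss(0, 5) for t in trend]

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(milk_price, nyse_index)
print(f"correlation: {r:.3f}")  # high, despite zero causal connection
```

A shared background trend (a "common cause" or even just coincidence) is often all it takes, which is why correlation alone never licenses a causal conclusion.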

Here are a couple of interesting examples:
Strange but true statistical correlations

A more interesting example can be found in the anti-vaccine movement.  This example is an instance of the logical fallacy called "post hoc ergo propter hoc" (after this, therefore because of this), which is a subspecies of the correlation/causation fallacy.  Just because an event regularly occurs after another doesn't mean that the first event is causing the second.  When I eat, I eat my salad first, then my protein, but my salad doesn't cause me to eat my protein.

Symptoms of autism become apparent about 6 months after the time a child gets their MMR vaccine.  Because one event occurs after the other, many naturally reason that the prior event is causing the later event.  But as I've explained, just because an event occurs prior to another event doesn't mean it causes it.

And why pick out one prior event out of the 6 months worth of other prior events?  And why ignore possible genetic and environmental causes?  Or why not say "well, my son got new shoes 6 months ago (prior event) therefore, new shoes cause autism"?  Until you can tease out all the variables, it's a huge stretch to attribute causation just because of temporal order.  

Constant Condition, Variable Condition, and Composite Cause
Ok, we're going to have to introduce a little bit of technical terminology to be able to distinguish between some important concepts.  I don't want to get too caught up in making the distinctions, I'm more concerned about you understanding what they are and (hopefully) the role they play in evaluating causal claims.

A constant condition is a causal factor that must be present if an event is to occur.  Consider combustion.  In order for there to be combustion there must be oxygen present.  But oxygen on its own doesn't cause combustion.  There's oxygen all around us but people aren't continuously bursting into flames.  However, without oxygen there can be no combustion.  In the case of combustion, we would say that oxygen is a constant condition.  That is, it is necessary for the causal event to occur, but it isn't the thing that initiates the causal chain.

When we look at the element or variable that actually initiates a causal chain of events, we call it the variable condition.  In the case of combustion it might be a lit match, a spark from electrical wires, or exploding gunpowder from a gun.  There can be many variable conditions.  

The point is you can't start a fire without a spark.  This gun's for hire.  You can't start a fire without a spark.  Even if we're just dancing in the dark.   Of course, you could also start a fire with several other things.  That's why we call it the variable condition.  But despite all the possible variable conditions, there must be oxygen present...even if we're just dancing in the dark.

As you might expect, when we consider the constant and the variable condition together, we call it the composite cause.  Basically, we are recognizing that for causal events there are some conditions that must be in place across all variable conditions, and there are some other conditions that have a direct causal effect but that could be "switched out" with other conditions (like the different sources of a spark).

Separating constant conditions from variable conditions can be useful in establishing policy.  For example, with nutrition: if we know that eating a certain type of diet can cause weight loss (and we want to lose weight), we can vary our diet's composition or quantity of calories (variable conditions) in order to lose weight.  The constant condition (that we will eat) we can't do much about.

Conversely, we can't control the variable conditions that cause the rain, but by buying umbrellas we can control the constant condition that rain causes us to get wet.  (Water is wet.  That's science!)

The Argument Structure of a General Causal Claim
Someone claims X causes Y.  But how do we evaluate it?  To begin we can use some of the tools we already acquired when we learned how to evaluate generalizations.  To do this we can think of general causal claims as a special case of a generalization (i.e., one about a causal relationship).

I'm sure you all recall that to evaluate a generalization we ask

(1) is the sample representative?  That is, (a) is it large enough to be statistically significant and (b) is it free of bias (i.e., does it incorporate all the relevant sub-groups included in the group you are generalizing about)?
(2) does X in the sample group really have the property Y (i.e., the property of causing event Y to occur)?

Once we've moved beyond these general evaluations we can look at specific elements in a general causal claim.  To evaluate the claim we have to look at the implied (but in good science, explicit) argument structure that supports the main claim, which is actually an expansion of (2) into further aspects of evaluation.  

A general causal claim has 4 implied premises.  Each one serves as an element to scrutinize.

Premise 1:  X is correlated with Y.  This means that there is some sort of relationship between event/object X and event/object Y, but it's too early to say it's causal.   Maybe it's temporal, maybe it's statistical, or maybe it's some other kind of relationship.  

For example, early germ theorist Koch suggested that we can determine if a disease is caused by micro-organisms if those micro-organisms are found on sick bodies and not on healthy bodies.  There was a strong correlation but not a necessary causal relation because for some diseases people can be carriers but immune to the disease.  

In other words, micro-organisms might be a constant condition in a disease causing sickness, but there may be other important variable causes (like environment or genetics) we must consider before we can say that a particular disease's micro-organisms cause sickness.

Premise 2:  The correlation between X and Y is not due to chance.  As we saw with the Uzbek milk prices and the NYSE, sometimes events can occur together but not have a causal relation--the world is full of wacky statistical relations.  Also, we are hard-wired to infer causation when one event happens prior to another.  But as you now know, this would be committing the post hoc ergo propter hoc fallacy.

Premise 3:   The correlation between X and Y is not due to some mutual cause Z.  Suppose someone thinks that "muscle soreness (X) causes muscle growth (Y)."  But this would be mistaken because it's actually exercising the muscle (Z) that causes both events.

In social psychology there was an interesting reinterpretation of a study that demonstrates this principle.  An earlier study showed a strong correlation between overall level of happiness and degree of participation in a religious institution.  The conclusion was that participation in a religious institution causes happiness.  

However, a subsequent study showed that there was a 3rd element (sense of belonging to a close-knit community) that explained the apparent relationship between happiness and religion.  Religious organizations are often close-knit communities, so it only appeared as though it was the religious element that caused the higher happiness appraisal.  It turns out that there is a more general explanation of which participation in a religious organization is an instance. 

Premise 4:  Y is not the cause of X.  This issue is often very difficult to disentangle.  This is known as trying to figure out the direction of the arrow of causation--and sometimes it can point both ways.  For instance, some people say that drug use causes criminal behaviour.  But in a recent discussion I had with a retired parole officer, he insisted that it's the other way around.  He says that youths with a predisposition toward criminal behaviour end up taking drugs only after they've entered a life of crime.  I think you could plausibly argue the arrow can point in both directions depending on the person, or maybe even within the same person (i.e., a feedback loop).  There's probably some legitimate research on this matter beyond my musings and the anecdotes of one officer, but this should suffice to illustrate the principle. 

Conclusion:  X causes Y.
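Premise 3 can be illustrated with a small simulation.  In this hypothetical sketch, a mutual cause Z (amount of exercise) drives both X (soreness) and Y (muscle growth); the numbers and the simple linear model are invented purely for illustration:

```python
import random

random.seed(0)

# Hypothetical illustration of Premise 3: a mutual cause Z (exercise)
# drives both X (soreness) and Y (muscle growth), so X and Y end up
# correlated even though neither causes the other.
exercise = [random.uniform(0, 10) for _ in range(1000)]      # Z
soreness = [z + random.gauss(0, 1) for z in exercise]        # X = Z + noise
growth = [0.5 * z + random.gauss(0, 1) for z in exercise]    # Y = 0.5*Z + noise

def correlation(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Soreness and growth are strongly correlated here, yet there is no
# direct causal link between them in the model that generated the data.
print(round(correlation(soreness, growth), 2))
```

The point of the sketch: a strong correlation between X and Y is exactly what we'd see even when Z does all the causal work.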

Premise 2, 3, and 4 are all about ruling out alternative explanations.  As critical thinkers evaluating or producing a causal argument, we need to seriously consider the plausibility of these alternative explanations.  Recall earlier in the semester we looked briefly at Popperian falsificationism.   We can extend this idea to causation:  i.e., we can never completely confirm a causal relationship, we can only eliminate competing explanations.

With that in mind, the implied premises in a general causal claim provide us a systematic way to evaluate the claim in pieces so we don't overlook anything important.  In other words, when you evaluate a general causal claim, you should do so by laying out the implied structure of the argument for the claim and evaluating each premise in turn. 

Wednesday, April 10, 2013

Critical Thinking: Assessing Generalizations and Polling

In the course of most arguments, factual or empirical claims will be made.  A factual or empirical claim is one that can actually or in theory be tested.   One common type of empirical claim is a generalization and another, which is a close relative of the former, is polling.  Let's look at each in turn.

A generalization is when an arguer moves from observations about some specific phenomena or objects to a general claim about all phenomena or objects belonging to that group.  For example, I go to McDonald's and order a Big Mac and it costs $1.50.   Then I go to another McDonald's and order another Big Mac and it also costs $1.50.  Based on these observations I generalize to the conclusion that all McDonald's restaurants will charge $1.50 for a Big Mac.  (I'm such a great scientist)

Another example would be if I ordered 1000 Tshirts that say "I Love Soviet Uzbekistan".  Upon receiving the order I might look at 5-10 of the shirts in 2 of the 10 boxes to make sure they were printed properly, then from those specific samples I'd generalize to the conclusion that all the Tshirts were printed correctly.

There's nothing really fancy going on with generalizations.  We all do it a lot in our everyday lives because it's practical and often it wouldn't make sense to do otherwise.

General vs Universal Claims
At this point we should make a distinction between a general claim and a universal claim.  A universal claim is that all X's have the property Y.  For example, all humans have a heart and a brain.  Universal claims are much stronger than general claims.  General claims admit of exceptions but are generally true of a set of objects.  For example, generally students like to sleep.  We could possibly point out some counter-examples, like Jittery Joe who doesn't like to sleep.  But, despite the occasional exception, we can accept the generalization as true.

Let's get technical for a moment and formalize these structures:
A universal claim will generally ;) have the form, "all Xs are Y".
A general claim will generally have the form, "Xs are, in general, Y," or "Xs are Y," or "Each X is probably Y".

Sometimes (surprisingly) in conversation or in an article the arguer won't spell things out or use the exact language I've specified here to distinguish between the two types of claims; nevertheless, if you pay attention to context, you should be able to determine which is being made.

The main point to understand is that universal claims don't allow any exceptions whereas general claims do.  Also, from the point of view of constructing and evaluating arguments, it is much more difficult to defend a universal claim than a general claim.

One last type of generalization is the proportional claim:  As you might expect, this type of claim expresses a proportion.   For example, looking through the first 2 boxes of my Tshirts, I notice that 1 out of every 7 is missing the letter "I".  So, even though I don't check the remaining boxes I conclude that 1 out of 7 of the Tshirts in those boxes is also missing an "I".  (i.e., I made a proportional generalization).
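The arithmetic behind the T-shirt example looks like this (the exact number of shirts inspected is hypothetical; the order size of 1000 comes from the earlier example):

```python
# Proportional generalization from the T-shirt example:
# 1 in 7 sampled shirts is misprinted, so we project that
# proportion onto the whole order of 1000 shirts.
sample_size = 14          # shirts actually inspected (hypothetical count)
defective_in_sample = 2   # shirts missing the letter "I"

defect_rate = defective_in_sample / sample_size   # 2/14 = 1 in 7
total_shirts = 1000
projected_defects = round(defect_rate * total_shirts)

print(f"Projected misprinted shirts: {projected_defects} of {total_shirts}")  # 143 of 1000
```

Whether that projection is legitimate depends entirely on the sample, which is the subject of the next sections.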

The thing is, (as you might expect) there are legitimate and illegitimate generalizations which have much to do with the nature and size of the sample from which the generalization is being made.

Sample Size Me!
For obvious reasons, the larger the sample size, the more accurately it will reflect the properties in the entire group of objects.  For instance, if I see one student and she tells me she has student loans, I shouldn't conclude that all students have loans.  Maybe I talk to 3 students and they also tell me they have student loans.  It could be that I just happened to talk to the 3 students that have student loans, it doesn't mean that all students have them.  The sample is still too small for me to legitimately make any inferences about all the students.

Now suppose I talk to 400 students and 100 of them (amazing round numbers!) tell me they have student loans.  At this point I might be able to make a reasonable generalization about students at that particular school or maybe in that particular region or state.

Sample Bias

One worry is that our sample is too small to justify generalizations.  The other is that our sample isn't representative enough of the group about which we are making the generalization.  For instance, if I wanted to make a generalization about the rate of US students with student loans it wouldn't be enough to collect data at only one school.  My sample would have to have about the same proportion of sub-groups as the general population I want to generalize about.  Maybe my sample happens to be from a rich school.  Maybe not.  Either way, this doesn't represent the average school.  Maybe that particular state provides excellent funding, maybe not.  Again, I want to make sure my sample represents the proportion of states or schools that do and don't provide excellent funding.

What is needed is to take samples from all over the country to have a representative sample of the larger group.  That is, the sample from which we will generalize should be broad enough to negate the clumpiness and should be representative of the group we are trying to generalize about.

In terms of evaluating and constructing arguments, beware of anecdotal evidence!  Why?  Remember biases?  Biases have a huge influence over what gets reported and what doesn't.  If we experience something that runs against our biases we tend to ignore it, while on the other hand we over-emphasize experiences that conform to them.  When we are using testimony as evidence (i.e., anecdotal evidence), we should be aware of this and how it increases the likelihood that our sample is biased.

Here are some common sources/common examples of bias: "it worked for me (or my Aunt Martha), therefore it works for everyone".  Another common bias (and a huge issue/problem in social psychology right now) is generalizing about human behavior from samples that have a geographical bias.

 In other words, for decades social psychologists and psychologists have been making generalizations about all of human psychology from samples of US college students.  As it turns out (from recent cross cultural studies) US culture is an outlier in terms of what's "normal" psychology throughout the world.  Yup.  We're the weird ones, not the rest of the world.  Oh! Snap! (but of course our way of being is the right way!)

In medicine, researchers go to great lengths to protect against a biased sample.  The gold standard is a double-blind, placebo-controlled, long-term, replicated study that includes different populations (i.e., ethnic groups) and both sexes.  The hallmark of pseudoscience in medicine is that often these standards are not applied or the sample is too small.

Rules for Good Generalizations
We can think of generalizations as following (implicitly or explicitly) the following argument scheme:
(S=the sample group, X=the group of objects that the generalization will be about, Y=the property we're attributing to Xs)
P1.  S is a sample of Xs.
P2.  The proportion of Ss (that are part of X) that have property Y is Z.
C.    The proportion of Xs that have property Y is Z*.
*see rule 4 below.

Let's use an actual example to get away from the alphabet soup:
P1.  The students in this class are a (representative) sample of UNLV students.
P2.  The proportion of the students in the class that have student loans is 60%.
C.    Therefore, the proportion of UNLV students with loans is around 60%.

P1.  10 species of cats is a sample of all cat species.
P2.  The proportion of cats in the sample that land on their feet when dropped from over 4 feet is 100%.
C.    Therefore, all cats will land on their feet when dropped from over 4 feet.

To evaluate generalizations we essentially want to scrutinize P1 and P2 and their logical connection to C.  To do so we ask if
1) The sample size is reasonable for the scope of the generalization.
2) The sample avoids biases.
3) Objects/Phenomena in the sample (X) do indeed have the property Y.
4)  The proportion of Xs with property Y in the sample is greater than or equal to the claimed proportion of Xs with property Y in the generalization.   (In other words, I can't say that 30% of Xs in my sample have property Y, yet then generalize that 40% of Xs have property Y.)

If a generalization violates one of these 4 criteria then it likely isn't a defensible generalization.
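Rules 2 and 3 call for judgment, but rules 1 and 4 can be checked mechanically.  Here's a minimal sketch (the function name and the minimum-size parameter are my own invention, not part of any standard method):

```python
def check_generalization(sample_n, min_n, sample_prop, claimed_prop):
    """Check the mechanically testable parts of the four rules.

    Rules 2 (bias) and 3 (do the sampled Xs really have Y?) require
    judgment, so only rules 1 and 4 are encoded here.  min_n stands in
    for whatever sample size is reasonable for the scope of the claim.
    """
    problems = []
    if sample_n < min_n:                  # rule 1: sample size
        problems.append("sample too small for the scope of the claim")
    if claimed_prop > sample_prop:        # rule 4: proportion consistency
        problems.append("claimed proportion exceeds the sample proportion")
    return problems

# The rule-4 violation from above: 30% in the sample, 40% claimed.
print(check_generalization(sample_n=400, min_n=100,
                           sample_prop=0.30, claimed_prop=0.40))
```

An empty list back means the generalization survives the mechanical checks; rules 2 and 3 still have to be argued for.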

Famous Polling Fails in US History

Polling is a subset of generalizations, so many of the rules for evaluation and analysis will be the same as in the previous section for generalizations.  Polling is a generalization about a specified population's beliefs or attitudes.  For example, during election campaigns, the populations in important "battleground" states are usually polled to find out what issues are important to them.  Upon hearing the results, the candidate will then remove what's left of his own spine and say whatever that population wants to hear.  (Meh!  Call me a cynic!)

Suppose I were to conduct a poll of UNLV students to determine their primary motivation for attending university.  To begin the evaluation of the poll we'd need to know 3 things:

(a)  The sample:  Who is in the sample (representativeness) and how big was that sample.
(b)  The population: What is the group I'm trying to make the generalization about.
(c)  The property in question:  What is that belief, attitude, or value I'm trying to attribute to the population.

Recall from the previous section that generalizations can be interpreted as having an (implicit or explicit) argument form.    Let's instantiate this argument structure with a hypothetical poll.  Suppose I want to poll UNLV students with the question, "should critical thinking 102 be a graduation requirement?"  Because I have finite time and energy I can't ask each student at the university.  Instead I'll take a sample and extrapolate from that.  My sample will be students in my class.

P1.  A sample of 36 students from my class is a representative sample of the general student population.
P2.  65% of the students in my class (i.e., the sample) said they agree that critical thinking 102 should be a graduation requirement.
C.  Therefore, we can conclude that around 65% of UNLV students think that critical thinking 102 should be a graduation requirement.

There are 2 broad categories of analysis we can apply to the poll results:

Sampling Errors
Questions about sampling errors apply to P1, which are basically: (a) is the sample size large enough to be representative of the group and (b) does the sample avoid any biases (i.e., does it avoid under or over representing one group over another in a way that isn't reflective of the general population).

Regarding sample size, national polls generally require a (representative) sample size of about 1000, so we should expect that a poll about the UNLV population could get by with quite a bit less than that.  Aside from that, (a) is self explanatory and I've discussed it above, so let's look a little more closely at (b).
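The 1000-person figure isn't arbitrary: the worst-case margin of error for a proportion shrinks with the square root of the sample size.  A quick sketch using the standard formula (the specific sample sizes are just for illustration):

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Worst-case margin of error for a proportion at ~95% confidence.

    Uses the standard formula z * sqrt(p*(1-p)/n), with p = 0.5 because
    that's the proportion that maximizes the margin.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Why national polls aim for roughly 1000 respondents: the margin
# drops quickly at first, then flattens out.
for n in (100, 400, 1000, 10000):
    print(f"n={n:5}: +/- {margin_of_error(n):.1%}")
```

At n = 1000 the margin is already close to +/- 3 points; quadrupling the sample to 4000 only cuts it in half, which is why pollsters stop around there.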

The question here is whether the students in my class accurately represent all important subgroups in the student population.  For example, is the sample representative of UNLV's general population's ratio of ethnic groups, socio-economic groups, and majors?  You might find that there are other important subgroups that should be captured in a sample depending on the content of the poll.

Someone might plausibly argue that the sample isn't representative because it disproportionately represents students in their 1st and 2nd years.

We can ask a further question about how the group was chosen.  For example, if I make filling out the survey voluntary then there's a possibility of bias.  Why? Because it's possible that people who volunteer for such a survey have a strong opinion one way or another.  This means that the poll will capture only those with strong opinions (or  those who just generally like to give their opinion) but leave out the Joe-Schmo population who might not have strong feelings or might be too busy facebooking on their got-tam phone to bother to do the survey.

In order to protect against such sampling errors polls should engage in random sampling.  That means no matter what sub-group someone is in, they have an equal probability of being selected to do the survey. We can also take things to a whole.  nuva.  level.  when we use stratified sampling.  With stratified sampling we make sure a representative proportion of each subgroup is contained in the general sample.   For example, if I know that about 30% of students are 1st year students then I'll make sure that 30% of my sample is randomly drawn from 1st year students.
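Here's a rough sketch of stratified sampling in code, using a made-up student body in which 30% are 1st years (all the counts are invented for illustration):

```python
import random

random.seed(1)

# Hypothetical student body grouped by year; the numbers are made up.
population = {
    "1st year": [f"1y-{i}" for i in range(300)],   # 30% of 1000 students
    "2nd year": [f"2y-{i}" for i in range(250)],
    "3rd year": [f"3y-{i}" for i in range(250)],
    "4th year": [f"4y-{i}" for i in range(200)],
}

def stratified_sample(strata, total_n):
    """Draw randomly within each subgroup, in proportion to its size."""
    pop_size = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        k = round(total_n * len(members) / pop_size)
        sample.extend(random.sample(members, k))
    return sample

sample = stratified_sample(population, total_n=100)
first_years = sum(1 for s in sample if s.startswith("1y-"))
print(f"1st years in sample: {first_years} of {len(sample)}")  # 30 of 100
```

Within each stratum the draw is still random; stratification only guarantees that each subgroup shows up in the right proportion.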

Another thing to consider in sampling bias is margin of error.  The margin of error (e.g. +/-5%) gives the range within which the true value for the whole population is likely to fall.  Margin of error is important to consider when there is a small difference between competing results.  For example, suppose a survey says 46% of students think Ami should be burned at the stake while 50% say Ami should be hailed as the next messiah.  One might think this clearly shows Ami's well on his way to establishing a new religion but we'd be jumping the gun until we looked at the poll's margin of error.

Suppose the margin of error is +/- 5 percentage points.  This means that support for burning Ami at the stake could actually be as high as 51% (46+5), while support for making him the head of a new religion could be as low as 45% (50-5).  Since the two ranges overlap, the poll can't tell us which view really has more support.  Ami might have to wait a few more years for world domination.
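Reading the margin of error as plus-or-minus percentage points, each poll result becomes an interval, and a comparison is only meaningful when the intervals don't overlap.  A minimal sketch using the Ami numbers:

```python
def interval(result, margin):
    """The range a poll result could plausibly fall within."""
    return (result - margin, result + margin)

def overlap(a, b):
    """True if two intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

burn = interval(46, 5)   # (41, 51): burn Ami at the stake
hail = interval(50, 5)   # (45, 55): hail Ami as the next messiah

# The intervals overlap, so the 4-point gap may just be noise.
print(overlap(burn, hail))  # True
```

With a tighter poll, say +/- 1 point, the intervals would separate and the 4-point gap would start to mean something.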

As I mentioned in the beginning of this section, questions about sampling error are all directed at P1; i.e., is the sample representative of the general population about which the general claim will be made.  Next we will look at measurement errors, which have to do with the second premise (i.e., that the people in the sample actually do have the beliefs/attitudes/properties attributed to them).

Measurement Errors
Measurement errors have to do with scrutinizing the claim that the sample population actually has the beliefs/attitudes/properties attributed to them in the survey.  Evaluating polls for measurement errors generally has to do with how the information was asked/collected, how the questions were worded, and the environmental conditions at the time of the poll.

As a starting point, when we are looking at polls that are about political issues, we should generally be skeptical of results--especially when polling agencies that are tied to a political party or ideologies produce competing poll results that conform with their respective positions.  In short, we should be alert to who is conducting the poll and consider whether there may be any biases.

One specific type of measurement error arises out of semantic ambiguity or vagueness. For example, suppose a survey asks if you drink "frequently".  This is a subjective term and could be interpreted differently: for some people it might mean once a week, for others once a day.  A measurement error will be introduced into the data unless this vagueness is cleared up.  Because more people probably think of "frequent drinking" as "more than what I personally drink", the results will be artificially low.  They also won't be very meaningful because the responses don't mean the same thing.

Another type of measurement error arises when we consider the medium by which the questions are asked.  Psychology tells us that people are more likely to tell the truth when asked questions face to face and less so when asked over the phone.  Even less so when asked in groups  (groupthink).  These considerations will introduce measurement errors; that is, they will cast doubt on whether the members of the sample actually have the quality/view/belief being attributed to them.

When evaluating measurement accuracy we should also consider when and where the poll took place.  For example, if, during exam period, students are asked whether they think school is stressful (generally), probably more will answer in the affirmative than if they were asked during the 1st week of the semester.  

Also, going back to our poll of students concerning critical thinking as a graduation requirement, we might argue that the timing is influencing the results.  The sample is taken from students currently taking the class.  Perhaps it's too early in their careers to appreciate the course's value; yet if we asked students who had already taken the course and have had a chance to enjoy its glorious fruits, the results might be different.

Finally, we should be alert to how second-hand reporting of polls can present the results in a distorted way.  Newspapers and media outlets want eyeballs, so they might over-emphasize certain aspects of the poll or interpret the results in a way that sensationalizes them.  In short, we should approach with a grain of salt polls that are reported second-hand.

To summarize:  For polling we want to evaluate (1) whether the sample is free of sampling errors (is it big enough and unbiased?) and (2) whether the individuals in the sample actually have the values/attitudes/beliefs being attributed to them (i.e., whether there are measurement errors).