Means by group *and* independent samples t-tests! Compare means two ways!
Feb 4, 2025
Using SPSS to look at different averages (means and medians by group) *or* to do independent samples t-tests: a walk-through of the process and how to interpret the SPSS output.
0:00
[Applause]
0:03
yay so today we are running SPSS and we
0:07
are going to start comparing means by
0:10
group so as usual we go to the analyze
0:13
menu we say compare means and in this
0:16
case we're just looking at means we're
0:18
not going to do any statistics yet we're
0:21
just going to compare them so it gives
0:23
us a dependent list and an independent
0:26
list and the dependent list is the
0:29
things that we want to get means on so
0:31
let's say the number of bedrooms the
0:33
number of bathrooms and the sale price
0:38
and days on Market why not do everything
0:41
and we're going to do this for each
0:43
house type okay so we're going to go
0:46
into options just because we can we're
0:48
going to look at this the mean number of
0:51
cases standard deviation and just for
0:54
fun the median let's move this around so
0:57
I'm holding down the mouse button and
0:59
dragging so I can change the order that
1:02
these things come out
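For anyone following along outside SPSS, the Means setup described here (mean, number of cases, standard deviation, and median for each house type) can be sketched in plain Python. This is only an illustration: the listings, the prices, and the `summarize_by_group` helper are made up and are not the data used in the video.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical listings as (style, sale price); values are made up for illustration.
listings = [
    ("Cape Cod", 300_000), ("Cape Cod", 320_000), ("Cape Cod", 355_000),
    ("Colonial", 400_000), ("Colonial", 450_000), ("Colonial", 620_000),
    ("Bi-Level", 380_000),
]

def summarize_by_group(rows):
    """Return n, mean, median and sample SD of price for each house style."""
    groups = defaultdict(list)
    for style, price in rows:
        groups[style].append(price)

    report = {}
    for style, prices in groups.items():
        report[style] = {
            "n": len(prices),
            "mean": mean(prices),
            "median": median(prices),
            # A sample standard deviation needs at least two cases.
            "sd": stdev(prices) if len(prices) > 1 else None,
        }
    return report

report = summarize_by_group(listings)
for style, stats in report.items():
    print(style, stats)
```

A group with a single case still gets a mean and a median, but no standard deviation, which is one more reason to check n before reading anything into a row.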
1:04
as I say continue I say okay and here it
1:09
is we get our case processing summary
1:11
which as usual we're going to ignore and
1:14
we get our full report so let's make
1:16
this a little bigger so it's important
1:19
to look at the number of cases that
1:22
are in each group so for
1:24
example here we have the bi-level houses
1:27
the sample size is always N, N for
1:30
number of things and we see that there's
1:33
just one house on the market which has
1:37
beds baths price and days on Market
1:39
information but there's just one so
1:41
we're going to ignore this because one
1:43
case of anything is kind of meaningless
1:45
so then we're going to look at Ranch
1:46
houses and we've got three Ranch houses
1:48
but that's really still not enough for
1:50
us to really think is going to tell us
1:53
anything serious we would like to see
1:56
preferably 30 but at least say 10 10 or
2:00
20 things uh 10 or 20 houses before we
2:03
can start feeling justified in making
2:05
comparisons so what we've really got
2:07
here to look at is Cape cods the
2:10
Colonials maybe the split levels and we
2:13
can look at here we see that the cape
2:16
cods they have an average of 3.6
2:18
bedrooms 1.8 bathrooms and the selling
2:22
price was an average of about
2:26
$326,000 and they spent an average of
2:28
around 85 days on the market if
2:30
you have a colonial you're probably
2:31
better off because if you take a look
2:33
here there's actually fewer bedrooms the
2:37
same number of
2:38
bathrooms and the price is considerably
2:42
higher they get about
2:44
$468,000 for these compared to
2:47
$326,000 for the cape cods on the
2:51
downside they spend 99 days on the
2:53
market versus around 85 days but that
2:56
might be because they've got the higher
2:57
prices the medians are lower than
3:00
the means and that's true for all of
3:02
these groups and the reason for that is
3:06
there's a certain minimum price that a
3:08
house will sell for the minimum price
3:10
for pretty much any house in Teaneck is
3:12
going to be around 100 maybe
3:15
$200,000 whereas there's no upper
3:17
boundary on the price for selling a
3:20
house so the problem with using averages
3:23
is that they have a tendency to get
3:24
thrown off by extremes at one end or the
3:28
other of the Curve so when you have
3:31
houses that are worth $2 million where
3:33
most of them are selling for say
3:35
$360,000 the average will be way higher
3:39
than it should be whereas the median is
3:42
a bit more stable because you're just
3:44
lining up all the numbers and picking
3:45
the one in the middle so if the ones at
3:48
the top are $5 and $6 million and all the
3:50
other ones are 300,000 the median will
3:53
be
3:53
$300,000 but the average could be
3:56
400,000 so there's times when an average
3:59
makes more sense and there's times
4:01
when a median makes more sense so if
4:03
you're looking at this and you're saying
4:05
well if I have a four-bedroom two-bath Cape
4:09
Cod how much can I expect to sell it for
4:12
then
4:13
$319,000 does seem indicated here all
4:16
else being equal so here we've got the
4:19
mean for each group and here group is
4:23
based on what type of house it is so one
4:26
group is Cape cods one group is
4:27
Colonials one group is ranches one
4:30
group is split levels so you've got the
4:32
mean for each of these variables so the
4:35
mean for beds the mean for baths the
4:37
mean for Price the mean for days on
4:39
market then you've got the median for
4:41
each of these and again but since you
4:43
can't be on the market for Less Than
4:45
Zero days then you can't get lower than
4:49
one or zero but you can get as high as 7
4:53
or 800 or even a thousand because a house
4:56
can just stay on the market for a couple
4:58
years so the median will be lower see
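The point about a few extreme values dragging the average above the median can be checked with a handful of made-up numbers. In this sketch, 98 hypothetical houses sell for $300,000 and two sell for $5 million and $6 million. These prices are assumptions chosen to echo the example in the talk, not real listings.

```python
from statistics import mean, median

# Made-up prices: most houses at $300,000 plus two very expensive ones.
prices = [300_000] * 98 + [5_000_000, 6_000_000]

avg_price = mean(prices)    # pulled upward by the two extreme sales
mid_price = median(prices)  # the middle value once the prices are lined up

print(f"mean:   ${avg_price:,.0f}")
print(f"median: ${mid_price:,.0f}")
```

With these numbers the median stays at the typical price while the mean lands noticeably above it, even though only two of the hundred houses are unusual.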
5:01
then you've got the number of cases if
5:03
you run it slightly differently you can
5:06
make it a little more Compact and you
5:07
can also do clever things here by double
5:11
clicking you get what's called pivot
5:13
trays so you can move your statistics up
5:16
here into
5:19
variables and now you might say it's
5:21
easier to compare means because you can
5:23
see them all right next to each other of
5:26
course you still have to go down to
5:29
the sample size to see which ones are
5:31
actually real so again to do that you
5:35
just double-click and then you can move
5:37
these things around to your heart's
5:39
content and it will reflect whatever
5:42
you're doing whether it makes sense or
5:47
not so there's all sorts of different
5:49
ways to look at this you're just uh
5:51
changing the way that the table is set
5:53
up and then you get out of it the usual
5:57
way and this is crossplatform it works
5:59
the same way if you're using a Mac
6:01
as if you're using Windows so that was
6:04
fun but now we're going to get rid of it
6:06
so the two types that we can really
6:08
compare the most easily are the cape
6:10
cods and the Colonials because we have
6:12
29 of one and 159 of the others and I
6:15
have a theory that Colonials sell for a
6:17
significantly higher price than the Cape
6:20
cods in other words that the difference
6:23
in price that I saw before is not just
6:25
due to random chance it's actually
6:27
something that exists in real life into
6:29
Variable View here's style and the ones
6:33
that we're looking for are cape cods and
6:35
Colonials 2 and three compare means
6:39
independent samples T Test I say
6:42
style two and
6:47
three and now I get this lovely um
6:50
output so what do I do let's look at
6:54
group statistics first so group
6:57
statistics is looking at each of these
6:59
variables and it says okay for Price
7:01
there were 29 reports of price for Cape
7:04
cods and 159 for Colonials here's the
7:08
mean selling price for Cape cods
7:16
326,885
7:17
196 about then we have the standard
7:21
deviation and look at that whopping great
7:23
standard deviation it is
7:28
$213,800 so you know that there's a lot
7:30
of variability going on so there's
7:33
probably some very expensive Colonials
7:35
mixed in with the others now let's move
7:37
to the independent samples T Test and
7:40
this is a little weird okay we're
7:43
looking at whether there's a difference
7:45
between the two groups but there's two
7:48
different formulas that we can use and
7:51
one of them is much more complicated
7:53
than the other so you've got your two
7:56
different formulas and the computer is
7:58
not saying I'm going to choose the
8:00
correct formula it's saying I'm going to
8:02
give you the information and do it both
8:04
ways so that you can decide whether to
8:07
assume equal variances or not now
8:09
typically what we do is we assume that
8:13
you have equal variances if the
8:16
significance of this F test here is
8:20
above
8:22
.05 so what that means is that if the
8:25
significance is at 0.05 or less it means
8:29
that there's a 5% or less chance that we
8:33
are seeing differences in the variance
8:36
due to chance so essentially if we have
8:40
a low significance here .05 or below then
8:43
we do not assume equal variances so we
8:46
go to the second line if significance is
8:49
above .05 then we assume equal variances
8:52
and we use the top line now I will admit
8:55
I generally don't look at this right
8:57
away first I look over here this is
9:01
the significance for the T test for
9:03
equality of means so this table is
9:06
basically two tables in one so here's
9:08
one table here around Levene's F test
9:11
for equality of variances I know it's an
9:13
F test because it has the letter F over
9:15
there and then there's this test over
9:18
here which is the T test for equality of
9:20
means and that's everything to the right
9:23
of this line starting with t going on to
9:26
DF which means degrees of freedom and all the
9:28
others mean difference by the way let's just
9:30
clarify one thing here mean difference
9:33
sounds more complicated than it is it's
9:36
just one mean minus the other so it's
9:38
468,396 subtracted from
9:43
326,470 so if I type this in so if I take
9:48
the mean 326,470.69 minus
9:55
468,396.23
9:57
equals -141,925.54
10:00
and there it is they're different by
10:03
.03 because they give you an extra
10:05
decimal point here than they do here
10:08
mean difference is just one mean minus
10:10
the other so for here we would expect it
10:12
to be 0.2 because 3.62 minus 3.42 equals 0.2
10:17
and here it is
10:19
rounded 0.2 again they kind of mess things
10:23
up by having an extra decimal place here
10:25
let's go back to what we are trying to
10:27
do we are trying to say is there a
10:28
significant difference between price and
10:30
number of bedrooms between cape cods and
10:33
Colonials and here we see that price
10:36
there is definitely a difference or to
10:39
be more scientific there is less than
10:41
one chance in 1,000 that we would see
10:44
such a large difference between these
10:46
two given these standard deviations and
10:49
given the number in the sample there's
10:51
there's less than a one in a thousand chance
10:53
that we would see this huge difference
10:55
just due to Randomness and chance and
10:58
weird factors so that's a pretty big uh
11:01
significance level and we can say all
11:04
right this is a big difference in price
11:06
we can see that with our eyes it's
11:08
141,000
11:10
$142,000 about so we're going to say
11:14
okay there is a significant difference
11:16
between these two the official way that
11:18
we're supposed to say that is that we
11:19
have to reject the null hypothesis that
11:22
there's no difference so for number of
11:24
bedrooms though it's a little muddier
11:26
here you notice that there's very little
11:28
difference 3.62 versus 3.42 bedrooms and
11:32
their standard deviations are not that
11:35
high first we look at what line to look
11:37
at we cannot assume equal variances
11:39
because this is significant which means
11:43
that there is a difference between the
11:44
variances so we go to the second line
11:46
and we come down here and we see aha
11:51
.093 is the significance you got to
11:55
reject that most of the time if I'm
11:57
doing one or two tests I will set my
11:59
Criterion at 0.05 that means that five
12:03
times out of 100 which is what 0.05
12:05
means or 5% of the time which is also
12:09
what 0.05 means I will see a quote
12:13
significant difference unquote just by
12:16
random chance so five times in 100 or
12:20
one time in 20 if I run 20 analyses and
12:24
I use 0.05 as my Criterion in theory one
12:28
of them will probably be a false
12:30
positive now the thing is that if you're
12:32
doing a lot of these tests then you
12:34
should use a lower level of p p being
12:38
the probability of getting a false
12:40
positive so uh what some people do is
12:43
they just go down to 0.01 as their P so
12:46
they won't take anything that's not less
12:48
than .01 and other people use what's
12:51
called the Bonferroni technique and that
12:54
means that you take your p and you
12:57
divide it by the number of analyses that
12:59
you're doing which in a case like this
13:01
would mean that we have p = .05 we divide
13:04
it by 2 we get p =
13:06
.025 but um if you're doing 100 analyses
13:09
it gets a little tough you look at the F
13:11
test if the F test is significant then
13:14
you use this second line here and you
13:16
look at the significance for the T Test
13:19
or you look at the significance for the
13:22
T Test and you say well I'd reject this
13:24
no matter what and I'd accept this no
13:26
matter what so it doesn't really matter
13:28
what the f test says unless these two
13:31
lines are saying different things so if
13:33
the significance was 0.02 for one of
13:36
them and 0.2 for another then you'd have to
13:39
go through the work of doing the F test
13:41
so that's basically the T Test and the T
13:44
Test is a very very handy thing to know
13:47
it's very often handy to compare two
13:49
groups against each other