SPSS Statistics for Data Analysis and Visualization


Dive deeper into SPSS Statistics for more efficient, accurate, and sophisticated data analysis and visualization

SPSS Statistics for Data Analysis and Visualization goes beyond the basics of SPSS Statistics to show you advanced techniques that exploit the full capabilities of SPSS. The authors explain when and why to use each technique, and then walk you through the execution with a pragmatic, nuts-and-bolts example. Coverage includes extensive, in-depth discussion of advanced statistical techniques, data visualization, predictive analytics, and SPSS programming, including automation and integration with other languages like R and Python. You'll learn the best methods to power through an analysis, with more efficient, elegant, and accurate code.

IBM SPSS Statistics is complex: true mastery requires a deep understanding of statistical theory, the user interface, and programming. Most users don't encounter all of the methods SPSS offers, leaving many little-known modules undiscovered. This book walks you through tools you may have never noticed, and shows you how they can be used to streamline your workflow and enable you to produce more accurate results.

  • Conduct a more efficient and accurate analysis
  • Display complex relationships and create better visualizations
  • Model complex interactions and master predictive analytics
  • Integrate R and Python with SPSS Statistics for more efficient, more powerful code

These "hidden tools" can help you produce charts that simply wouldn't be possible any other way, and the support for other programming languages gives you better options for solving complex problems. If you're ready to take advantage of everything this powerful software package has to offer, SPSS Statistics for Data Analysis and Visualization is the expert-led training you need.


KEITH MCCORMICK is a data mining consultant, trainer, and speaker. A passionate user of SPSS for 25 years, he has trained thousands on how to effectively use SPSS Statistics and SPSS Modeler. He blogs at keithmccormick.com.

JESUS SALCEDO is an independent statistical consultant. He is a former SPSS Curriculum Team Lead and Senior Education Specialist who has written numerous SPSS training courses and trained thousands of users.

JON PECK, now retired from IBM, was a senior engineer, statistician, and product strategist for SPSS and IBM for 32 years. He designed and contributed to many features of SPSS Statistics and has consulted with and trained many users. He remains active on social media.

ANDREW WHEELER is a researcher in criminal justice and a former crime analyst. He has used SPSS for over 8 years, and often blogs SPSS tutorials at andrewpwheeler.wordpress.com.


Foreword xxiii

Introduction xxvii

Part I Advanced Statistics 1

Chapter 1 Comparing and Contrasting IBM SPSS AMOS with Other Multivariate Techniques 3

T-Test 7



Factor Analysis and Unobserved Variables in SPSS 23


Revisiting Factor Analysis and a General Orientation to AMOS 26

The General Model 29

Chapter 2 Monte Carlo Simulation and IBM SPSS Bootstrapping 43

Monte Carlo Simulation 44

Monte Carlo Simulation in IBM SPSS Statistics 44

Creating an SPSS Model File 45

IBM SPSS Bootstrapping 59

Proportions 63

Bootstrap Mean 66

Bootstrap and Linear Regression 68

Chapter 3 Regression with Categorical Outcome Variables 71

Regression Approaches in SPSS 72

Logistic Regression 73

Ordinal Regression Theory 74

Assumptions of Ordinal Regression Models 77

Ordinal Regression Dialogs 77

Ordinal Regression Output 81

Categorical Regression Theory 86

Assumptions of Categorical Regression Models 87

Categorical Regression Dialogs 87

Categorical Regression Output 93

Chapter 4 Building Hierarchical Linear Models 101

Overview of Hierarchical Linear Mixed Models 102

A Two-Level Hierarchical Linear Model Example 102

Mixed Models…Linear 104

Mixed Models…Linear (Output) 113

Mixed Models…Generalized Linear 116

Mixed Models…Generalized Linear (Output) 120

Adjusting Model Structure 126

Part II Data Visualization 129

Chapter 5 Take Your Data Visualizations to the Next Level 131

Graphics Options in SPSS Statistics 132

Understanding the Revolutionary Approach in The Grammar of Graphics 136

Bar Chart Case Study 138

Bubble Chart Case Study 143

Chapter 6 The Code Behind SPSS Graphics: Graphics Production Language 147

Introducing GPL: Bubble Chart Case Study 147

GPL Help 155

Bubble Chart Case Study Part Two 156

Double Regression Line Case Study 160

Arrows Case Study 163

MBTI Bubble Chart Case Study 167

Chapter 7 Mapping in IBM SPSS Statistics 173

Creating Maps with the Graphboard Template Chooser 174

Creating a Choropleth of Counts Map 175

Creating Other Map Types 179

Creating Maps Using Geographical Coordinates 185

Chapter 8 Geospatial Analytics 193

Geospatial Association Rules 194

Case Study: Crime and 311 Calls 194

Spatio-Temporal Prediction 207

Case Study: Predicting Weekly Shootings 207

Chapter 9 Perceptual Mapping with Correspondence Analysis, GPL, and OMS 217

Starting with Crosstabs 220

Correspondence Analysis 224

Multiple Correspondence Analysis 234

Crosstabulations 234

Applying OMS and GPL to the MCA Perceptual Map 242

Chapter 10 Display Complex Relationships with Multidimensional Scaling 249

Metric and Nonmetric Multidimensional Scaling 251

Nonmetric Scaling of Psychology Sub-Disciplines 251

Multidimensional Scaling Dialog Options 253

Multidimensional Scaling Output Interpretation 259

Subjective Approach to Dimension Interpretation 264

Statistical Approach to Dimension Interpretation 266

Part III Predictive Analytics 271

Chapter 11 SPSS Statistics versus SPSS Modeler: Can I Be a Data Miner Using SPSS Statistics? 275

What Is Data Mining? 275

What Is IBM SPSS Modeler? 276

Can Data Mining Be Done in SPSS Statistics? 278

Hypothesis Testing, Type I Error, and Hold-Out Validation 280

Significance of the Model and Importance of Each Independent Variable 284

The Importance of Finding and Modeling Interactions 284

Classic and Important Data Mining Tasks 287

Partitioning and Validating 288

Feature Selection 291

Balancing 294

Comparing Results from Multiple Models 295

Creating Ensembles 297

Scoring New Records 300

Chapter 12 IBM SPSS Data Preparation 303

Identify Unusual Cases 304

Identify Unusual Cases Dialogs 305

Identify Unusual Cases Output 311

Optimal Binning 315

Optimal Binning Dialogs 316

Optimal Binning Output 321

Chapter 13 Model Complex Interactions with IBM SPSS Neural Networks 325

Why “Neural” Nets? 326

The Famous Case of Exclusive OR and the Perceptron 328

What Is a Hidden Layer and Why Is It Needed? 332

Neural Net Results with the XOR Variables 333

How the Weights Are Calculated: Error Backpropagation 337

Creating a Consistent Partition in SPSS Statistics 340

Comparing Regression to Neural Net with the Bank Salary Case Study 341

Calculating Mean Absolute Percent Error for Both Models 344

Classification with Neural Nets Demonstrated with the Titanic Dataset 349

Chapter 14 Powerful and Intuitive: IBM SPSS Decision Trees 355

Building a Tree with the CHAID Algorithm 355

Review of the CHAID Algorithm 360

Adjusting the CHAID Settings 363

CRT for Classification 366

Understanding Why the CRT Algorithm Produces a Different Tree 368

Missing Data 369

Changing the CRT Settings 369

Comparing the Results of All Four Models 371

Alternative Validation Options 373

The Scoring Wizard 374

Chapter 15 Find Patterns and Make Predictions with K Nearest Neighbors 379

Using KNN to Find “Neighbors” 380

The Titanic Dataset and KNN Used as a Classifier 381

The Trade-Offs between Bias and Variance 386

Comparing Our Models: Decision Trees, Neural Nets, and KNN 388

Building an Ensemble 391

Part IV Syntax, Data Management, and Programmability 393

Chapter 16 Write More Efficient and Elegant Code with SPSS Syntax Techniques 395

A Syntax Primer for the Uninitiated 396

Making the Connection: Menus and the Grammar of Syntax 401

What Is “Inefficient” Code? 403

The Case Study 404

Customer Dataset 406

Fixing the ZIP Codes 407

Addressing Case Sensitivity of City Names with UPPER() and LOWER() 409

Parsing Strings and the Index Function 410

Aggregate and Restructure 410

Pasting Variable Names, TO, Recode, and Count 412

DO REPEAT Spend Ratios 414

Merge 415

Final Syntax File 417

Chapter 17 Automate Your Analyses with SPSS Syntax and the Output Management System 421

Overview of the Output Management System 422

Running OMS from Menus 423

Automatically Writing Selected Categories of Output to Different Formats 424

Suppressing Output 429

Working with OMS Data 436

Running OMS from Syntax 438

Chapter 18 Statistical Extension Commands 441

What Is an Extension Command? 441

TURF Analysis—Designing Product Bundles 444

Large Problems 449

Quantile Regression—Predicting Airline Delays 450

Comparing Ordinary Least Squares with Quantile Regression Results 455

Operational Considerations 459

Support Vector Machines—Predicting Loan Default 461

Background 461

An Example 464

Operational Issues 467

Computing Cohen’s d Measure of Effect Size for a T-Test 468

Index 473