XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition

This book is primarily a practical reference book for professional XSLT developers. It assumes no previous knowledge of the language, and many developers have used it as their first introduction to XSLT; however, it is not structured as a tutorial, and there are other books on XSLT that provide a gentler approach for beginners.

The book does assume a basic knowledge of XML, HTML, and the architecture of the Web, and it is written for experienced programmers. There’s no assumption that you know any particular language such as Java or Visual Basic, just that you recognize the concepts that all programming languages have in common.

The book is suitable both for XSLT 1.0 users upgrading to XSLT 2.0 and for newcomers to XSLT. It is equally suitable whether you work in the Java or .NET world.

As befits a reference book, a key aim is that the coverage should be comprehensive and authoritative. It is designed to give you all the details, not just an overview of the 20 percent of the language that most people use 80 percent of the time. It’s designed so that you will keep coming back to the book whenever you encounter new and challenging programming tasks, not as a book that you skim quickly and then leave on the shelf. If you like detail, you will enjoy this book; if not, you probably won’t.

But as well as giving the detail, this book aims to explain the concepts, in some depth. It’s therefore a book for people who not only want to use the language but who also want to understand it at a deep level.

The book aims to tell you everything you need to know about the XSLT 2.0 language. It gives equal weight to the things that are new in XSLT 2.0 and the things that were already present in version 1.0. The book is about the language, not about specific products. However, there are appendices about Saxon (the author’s own implementation of XSLT 2.0), about the Altova XSLT 2.0 implementation, and about the Java and Microsoft APIs for controlling XSLT transformations, which will no doubt be upgraded to handle XSLT 2.0 as well as 1.0. A third XSLT 2.0 processor, Gestalt, was released shortly before the book went to press, too late to describe it in any detail. But the experience of XSLT 1.0 is that there has been a very high level of interoperability between different XSLT processors, and if you can use one of them, then you can use them all.

In the previous edition we split XSLT 2.0 and XPath 2.0 into separate volumes. The idea was that some readers might be interested in XPath alone. However, many bought the XSLT 2.0 book without its XPath companion and were left confused as a result; so this time, the material is back together. The XPath reference information is in self-contained chapters, so it should still be accessible when you use XPath in contexts other than XSLT.

The book does not cover XSL Formatting Objects, a big subject in its own right. Nor does it cover XML Schemas in any detail. If you want to use these important technologies in conjunction with XSLT, there are other books that do them justice.

This book contains twenty chapters and eight appendixes (the last of which is a glossary) organized into four parts. The following section outlines what you can find in each part, chapter, and appendix.

Part I: Foundations: The first part of the book covers essential concepts. You should read these before you start coding. If you ignore this advice, as most people do, then read them when you get to that trough of despair when you find it impossible to make the language do anything but the most trivial tasks. XSLT is different from other languages, and to make it work for you, you need to understand how it was designed to be used.

Chapter 1: XSLT in Context: This chapter explains how XSLT fits into the big picture: how the language came into being and how it sits alongside other technologies. It also has a few simple coding examples to keep you alert.

Chapter 2: The XSLT Processing Model: This is about the architecture of an XSLT processor: the inputs, the outputs, and the data model. Understanding the data model is perhaps the most important thing that distinguishes an XSLT expert from an amateur; it may seem like information that you can’t use immediately, but it’s knowledge that will stop you making a lot of stupid mistakes.

Chapter 3: Stylesheet Structure: XSLT development is about writing stylesheets, and this chapter takes a bird’s eye view of what stylesheets look like. It explains the key concepts of rule-based programming using templates, and explains how to undertake programming-in-the-large by structuring your application using modules and pipelines.

Chapter 4: Stylesheets and Schemas: A key innovation in XSLT 2.0 is that stylesheets can take advantage of knowledge about the structure of your input and output documents, provided in the form of an XML Schema. This chapter provides a quick overview of XML Schema to describe its impact on XSLT development. Not everyone uses schemas, and you can skip this chapter if you fall into that category.

Chapter 5: The Type System: XPath 2.0 and XSLT 2.0 offer strong typing as an alternative to the weak typing approach of the 1.0 languages. This means that you can declare the types of your variables, functions, and parameters, and use this information to get early warning of programming errors. This chapter explains the data types available and the mechanisms for creating user-defined types.

Part II: XSLT and XPath Reference: This section of the book contains reference material, organized in the hope that you can easily find what you need when you need it. It’s not designed for sequential reading, though you might well want to leaf through the pages to discover what’s there.

Chapter 6: XSLT Elements: This monster chapter lists all the XSLT elements you can use in a stylesheet, in alphabetical order, giving detailed rules for the syntax and semantics of each element, advice on usage, and examples. This is probably the part of the book you will use most frequently as you become an expert XSLT user. It’s a “no stone unturned” approach, based on the belief that as a professional developer you need to know what happens when the going gets tough, not just when the wind is in your direction.

Chapter 7: XPath Fundamentals: This chapter explains the basics of XPath: the low-level constructs such as literals, variables, and function calls. It also explains the context rules, which describe how the evaluation of XPath expressions depends on the XSLT processing context in which they appear.

Chapter 8: XPath: Operators on Items: XPath offers the usual range of operators for performing arithmetic, boolean comparison, and the like. However, these don’t always behave exactly as you would expect, so it’s worth reading this chapter to see what’s available and how it differs from the last language that you used.

Chapter 9: XPath: Path Expressions: Path expressions are what make XPath special; they enable you to navigate around the structure of an XML document. This chapter explains the syntax of path expressions, the 13 axes that you can use to locate the nodes that you need, and associated operators such as union, intersection, and difference.

Chapter 10: XPath: Sequence Expressions: Unlike XPath 1.0, in version 2.0 all values are sequences (singletons are just a special case). Some of the most important operators in XPath 2.0 are those that manipulate sequences, notably the «for» expression, which translates one sequence into another by applying a mapping.

Chapter 11: XPath: Type Expressions: The type system was explained in Chapter 5; this chapter explains the operations that you can use to take advantage of types. This includes the «cast» operation, which is used to convert values from one type to another. A big part of this chapter is devoted to the detailed rules for how these conversions are done.

Chapter 12: XSLT Patterns: This chapter returns from XPath to a subject that’s specific to XSLT. Patterns are used to define template rules, the essence of XSLT’s rule-based programming approach. The reason for explaining them now is that the syntax and semantics of patterns depend strongly on the corresponding rules for XPath expressions.

Chapter 13: The Function Library: XPath 2.0 includes a library of functions that can be called from any XPath expression; XSLT 2.0 extends this with some additional functions that are available only when XPath is used within XSLT. The library has grown immensely since XPath 1.0. This chapter provides a single alphabetical reference for all these functions.

Chapter 14: Regular Expressions: Processing of text is an area where XSLT 2.0 and XPath 2.0 are much more powerful than version 1.0, and this is largely through the use of constructs that exploit regular expressions. If you’re familiar with regexes from languages such as Perl, this chapter tells you how XPath regular expressions differ. If you’re new to the subject, it explains it from first principles.

Chapter 15: Serialization: Serialization in XSLT means the ability to generate a textual XML document from the tree structure that’s manipulated by a stylesheet. This isn’t part of XSLT processing proper, so (following W3C’s lead) it’s separated into its own chapter. You can control serialization from the stylesheet using an <xsl:output> declaration, but many products also allow you to control it directly via an API.

Part III: Exploitation: The final section of the book is advice and guidance on how to take advantage of XSLT to write real applications. It’s intended to make you not just a competent XSLT coder, but a competent designer too. The best way of learning is by studying the work of others, so the emphasis here is on practical case studies.

Chapter 16: Extensibility: This chapter describes the “hooks” provided in the XSLT specification to allow vendors and users to plug in extra functionality. The way this works will vary from one implementation to another, so we can’t cover all possibilities, but one important aspect that the chapter does cover is how to use such extensions and still keep your code portable.

Chapter 17: Stylesheet Design Patterns: This chapter explores a number of design and coding patterns for XSLT programming, starting with the simplest “fill-in-the-blanks” stylesheet, and extending to the full use of recursive programming in the functional programming style, which is needed to tackle problems of any computational complexity. This provides an opportunity to explain the thinking behind functional programming and the change in mindset needed to take full advantage of this style of development.

Chapter 18: Case Study: XMLSpec: XSLT is often used for rendering documents, so where better to look for a case study than the stylesheets used by the W3C to render the XML and XSLT specifications, and others in the same family, for display on the web? The resulting stylesheets are typical of those you will find in any publishing organization that uses XML to develop a series of documents with a compatible look-and-feel.

Chapter 19: Case Study: A Family Tree: Displaying a family tree is another typical XSLT application. This example works with semi-structured data (a mixture of fairly complex data and narrative text) that can be presented in many different ways for different audiences. It also shows how to tackle another typical XSLT problem, conversion of the data into XML from a legacy text-based format. As it happens, this uses nearly all the important new XSLT 2.0 features in one short stylesheet. But another aim of this chapter is to show a collection of stylesheets doing different jobs as part of a complete application.

Chapter 20: Case Study: Knight's Tour: Finding a route around a chessboard where a knight visits every square without ever retracing its steps might sound a fairly esoteric application for XSLT, but it’s a good way of showing how even the most complex of algorithms are within the capabilities of the language. You may not need to tackle this particular problem, but if you want to construct an SVG diagram showing progress against your project plan, then the problems won’t be that dissimilar.

Part IV: Appendices

Appendix A: XPath 2.0 Syntax Summary: Collects the XPath grammar rules and operator precedences into one place for ease of reference.

Appendix B: Error Codes: A list of all the error codes defined in the XSLT and XPath language specifications, with brief explanations to help you understand what’s gone wrong.

Appendix C: Backward Compatibility: The list of things you need to look out for when converting applications from XSLT 1.0.

Appendix D: Microsoft XSLT Processors: Although the two Microsoft XSLT processors don’t yet support XSLT 2.0, we thought many readers would find it useful to have a quick summary here of the main objects and methods used in their APIs.

Appendix E: JAXP: the Java API for XML Processing: JAXP is an interface rather than a product. Again, it doesn’t have explicit support yet for XSLT 2.0, but Java programmers will often be using it in XSLT 2.0 projects, so the book includes an overview of the classes and methods available.

Appendix F: Saxon: At the time of writing, Saxon (developed by the author of this book) provides the most comprehensive implementation of XSLT 2.0 and XPath 2.0, so its interfaces and extensions are covered in some detail.

Appendix G: Altova: Altova, the developers of XML Spy, have an XSLT 2.0 processor that can be used either as part of the development environment or as a freestanding component. This appendix gives details of its interfaces.

Appendix H: Glossary

Note: CD-ROM/DVD and other supplementary materials are not included as part of eBook file.


Michael Kay has been working in the XML field since 1997; he became a member of the XSL Working Group soon after the publication of XSLT 1.0, and took over as editor of the XSLT 2.0 specification in early 2001. He is also a member of the XQuery and XML Schema Working Groups, and is a joint editor of the XPath 2.0 specification. He is well known not only through previous editions of this book but also as the developer of the open source Saxon product, a pioneering implementation of XSLT 2.0, XPath 2.0, and XQuery 1.0.
In 2004 the author formed his own company, Saxonica, to provide commercial software and services building on the success of the Saxon technology. Previously, he spent three years with Software AG, working with the developers of the Tamino XML server, an early XQuery implementation. His background is in database technology: after leaving the University of Cambridge with a Ph.D., he worked for many years with the (then) computer manufacturer ICL, developing network, relational, and object-oriented database software products as well as a text search engine, and held the position of ICL Fellow.


Introduction

List of Examples

Part I: Foundations

Chapter 1: XSLT in Context

What Is XSLT?

How Does XSLT Transform XML?

The Place of XSLT in the XML Family

The History of XSL

XSLT 2.0 as a Language

Summary

Chapter 2: The XSLT Processing Model

XSLT: A System Overview

The XDM Tree Model

The Transformation Process

Error Handling

Variables and Expressions

Summary

Chapter 3: Stylesheet Structure

Changes in XSLT 2.0

The Modular Structure of a Stylesheet

The <xsl:stylesheet> Element

The <?xml-stylesheet?> Processing Instruction

Embedded Stylesheets

Declarations

Instructions

Simplified Stylesheets

Writing Portable Stylesheets

Whitespace

Summary

Chapter 4: Stylesheets and Schemas

XML Schema: An Overview

Declaring Types in XSLT

Validating the Source Document

Validating the Result Document

Validating a Temporary Document

Validating Individual Elements

Validating Individual Attributes

The default-validation Attribute

Importing Schemas

Using xsi:type

Nillability

Summary

Chapter 5: Types

What Is a Type System?

Changes in 2.0

Sequences

Atomic Values

Atomic Types

Schema Types and XPath Types

The Type Matching Rules

Static and Dynamic Type Checking

Summary

Part II: XSLT and XPath Reference

Chapter 6: XSLT Elements

Summary

Chapter 7: XPath Fundamentals

Notation

Where to Start

Expressions

Lexical Constructs

Primary Expressions

Variable References

Parenthesized Expressions

Context Item Expressions

Function Calls

Conditional Expressions

The XPath Evaluation Context

Summary

Chapter 8: XPath: Operators on Items

Arithmetic Operators

Value Comparisons

General Comparisons

Node Comparisons

Boolean Expressions

Summary

Chapter 9: XPath: Path Expressions

Examples of Path Expressions

Changes in XPath 2.0

Document Order and Duplicates

The Binary «/» Operator

Axis Steps

Rooted Path Expressions

The «//» Abbreviation

Combining Sets of Nodes

Summary

Chapter 10: XPath: Sequence Expressions

The Comma Operator

Numeric Ranges: The «to» Operator

Filter Expressions

The «for» Expression

Simple Mapping Expressions

The «some» and «every» Expressions

Summary

Chapter 11: XPath: Type Expressions

Converting Atomic Values

Sequence Type Descriptors

The «instance of» Operator

The «treat as» Operator

Summary

Chapter 12: XSLT Patterns

Patterns and Expressions

Changes in XSLT 2.0

The Formal Definition

An Informal Definition

Conflict Resolution

Matching Parentless Nodes

The Syntax of Patterns

Summary

Chapter 13: The Function Library

A Word about Naming

Functions by Category

Notation

Code Samples

Function Definitions

Summary

Chapter 14: Regular Expressions

Branches and Pieces

Quantifiers

Atoms

Subexpressions

Back-References

Character Groups

Character Ranges

Character Class Escapes

Character Blocks

Character Categories

Flags

Disallowed Constructs

Summary

Chapter 15: Serialization

The XML Output Method

The HTML Output Method

The XHTML Output Method

The Text Output Method

Using the <xsl:output> declaration

Character Maps

Disable Output Escaping

Summary

Part III: Exploitation

Chapter 16: Extensibility

What Vendor Extensions Are Allowed?

Extension Functions

Keeping Extensions Portable

Summary

Chapter 17: Stylesheet Design Patterns

Fill-in-the-Blanks Stylesheets

Navigational Stylesheets

Rule-Based Stylesheets

Computational Stylesheets

Summary

Chapter 18: Case Study: XMLSpec

Formatting the XML Specification

Preface

Creating the HTML Outline

Formatting the Document Header

Creating the Table of Contents

Creating Section Headers

Formatting the Text

Producing Lists

Making Cross-References

Setting Out the Production Rules

Overlay Stylesheets

Stylesheets for Other Specifications

Summary

Chapter 19: Case Study: A Family Tree

Modeling a Family Tree

Creating a Data File

Displaying the Family Tree Data

Summary

Chapter 20: Case Study: Knight’s Tour

The Problem

The Algorithm

Placing the Knight

Displaying the Final Board

Finding the Route

Running the Stylesheet

Observations

Summary

Part IV: Appendices

Appendix A: XPath 2.0 Syntax Summary

Whitespace and Comments

Tokens

Syntax Productions

Operator Precedence

Appendix B: Error Codes

Functions and Operators (FO)

XPath Errors (XP)

XSLT Errors (XT)

Appendix C: Backward Compatibility

Stage 1: Backward-compatibility Mode

Stage 2: Setting version="2.0"

Stage 3: Adding a Schema

Summary

Appendix D: Microsoft XSLT Processors

MSXML

System.Xml

Summary

Appendix E: JAXP: The Java API for Transformation

The JAXP Parser API

The JAXP Transformation API

Examples of JAXP Transformations

Summary

Appendix F: Saxon

Using Saxon from the Command Line

Using Saxon from a Java Application

Using Saxon from a .NET Application

Saxon Tree Models

Extensibility

Extensions

The evaluate() Extension

Summary

Appendix G: Altova

Running from within XMLSpy

Conformance

Extensions and Extensibility

The Command Line Interface

Using the API

Summary

Appendix H: Glossary

Index