Introducing Comega
January 12, 2005
Introduction
One of the main reasons for XML's rise to prominence as the lingua franca for information interchange is that, unlike prior data interchange formats, XML can easily represent both rigidly structured tabular data (e.g., relational data or serialized objects) and semi-structured data (e.g., office documents). The former tends to be strongly typed and is typically processed using object⇔XML mapping technologies, while the latter tends to be untyped and is usually processed using XML technologies like DOM, SAX, and XSLT. However in both cases, there is a disconnect for developers processing XML using traditional object-oriented programming languages.
In the case of processing strongly typed XML using object⇔XML mapping technologies, there is the impedance mismatch between programming language objects and XML schema languages like DTDs or W3C XML Schema. Notions such as the distinction between elements and attributes, document order, and content models that specify a choice of elements are all intrinsic to XML schema languages but have no counterpart in standard object-oriented programming. These mismatches tend to lead to some contortions and lossiness when mapping XML to objects. Processing untyped XML documents using technologies such as XSLT or the DOM has a different set of issues. In the case of XSLT, or other XML-specific languages like XQuery, the developer has to learn a whole other language to effectively process XML as well as their programming language of choice. Often, all the benefits of the IDE of the host language--for example, compile time checking and tooltips--cannot be utilized when processing XML. In the case of processing XML with DOM or SAX, developers often complain that the code they have to write tends to become unwieldy and cumbersome.
As design patterns and APIs for performing particular tasks become widely used, they sometimes become incorporated into programming languages. Programming languages like Java and C# have promoted concepts which exist as design patterns and API calls in other languages, such as native string types, memory management via garbage collection, and event handling to core constructs within the language. This evolutionary process has now begun to involve XML. As XML has grown in popularity, certain parties have begun to integrate constructs for creating and manipulating XML into mainstream programming languages. The expectation is that making XML processing a native part of these programming languages will ease some of the problems facing developers that use traditional XML processing techniques.
Two of the most notable, recent examples of XML being integrated into programming languages are Cω (C-Omega)--an extension of C# produced by Microsoft Research--and ECMAScript for XML (E4X)--an ECMAScript extension produced by ECMA International. This article provides an overview of the XML features of Cω; my next article will explore the E4X language. This article begins with an introduction to the changes to the C# type system made in Cω, followed by a look at the operators added to the C# language to enable easier processing of relational and XML data.
The Cω Type System
The goal of the Cω type system is to bridge the gap between relational, object, and XML data access by creating a type system that is a combination of all three data models. Instead of adding built-in XML or relational types to the C# language, the approach displayed by the Cω type system has been to add certain general changes to the C# type system that make it more conducive for programming against both structured relational data and semi-structured XML data.
A number of the changes to C# made in Cω make it more conducive for programming against strongly typed XML, specifically XML constrained by W3C XML Schema. Several concepts from XML and XML Schema have analogous features in Cω: document order, the distinction between elements and attributes, having multiple fields with the same name but different values, and content models that specify a choice of types for a given field. A number of these concepts are handled in traditional object⇔XML mapping technologies but it is often with awkwardness. Cω makes programming against strongly typed XML as natural as programming against arrays or strings in traditional programming languages.
Streams
Streams in Cω are analogous to sequences in XQuery and XPath 2.0 and .NET 2.0's System.Collections.Generic.IEnumerable<T> type. With the existence of streams, Cω promotes the concept of an ordered, homogenous collection of zero or more items into a programming language construct. Streams are a fundamental aspect of the Cω type system upon which a number of the other extensions to C# rest.
A stream is declared by appending the operator *
to the type name in the
declaration of the variable. Typically, streams are generated using iterator functions.
An
iterator function is a function that returns an ordered sequence of values by using
a
yield
statement to return each value in turn. When a value is yielded, the
state of the iterator function is preserved and the caller is allowed to execute.
The next
time the iterator is invoked, it continues from the previous state and yields the
next
value. Iterators functions in Cω work like the iterators
functions planned for C# 2.0 The most obvious difference between iterator functions in
Cω and iterator functions in C# is that the former return a stream (T*
)
while C# 2.0 iterators return an enumerator (IEnumerator<T>
). However,
there is little difference in behavior when interacting with a stream or an enumerator.
The
more significant difference is that, just like sequences in XQuery, streams in Cω
cannot contain other streams. When multiple streams are combined, the results are
flattened
into a single stream. For example, appending the stream (4,5) to the stream (1,2,3)
results
in a stream containing five items (1,2,3,4,5), not a stream with four items (1, 2,
3, (4,
5)). In C# 2.0, it isn't possible to combine multiple enumerators in such a manner,
although
it is possible to create an enumerator of enumerators.
The following iterator function returns a stream containing the three books in the Lord of the Rings Trilogy.
public string* LoTR(){ yield return "The Fellowship of the Ring"; yield return "The Two Towers"; yield return "The Return of the King"; }
The results can be processed using a traditional C# foreach
loop:
public void PrintTrilogyTitles(){ foreach(string title in LoTR()) Console.WriteLine(title); }
A powerful feature of Cω streams is that one can invoke methods on a stream which are then translated to subsequent method calls on each item in the stream. The following method shows an example of this feature in action:
public void PrintTrilogyTitleLengths(){ foreach(int size in LoTR().Length) Console.WriteLine(size); }
This method call results in the value of the Length
property being invoked on
each string returned by the PrintTrilogyTitles()
method. The ability to access
the properties of the contents of a stream allows XPath-style queries over object
graphs.
There is also the concept of an apply-to-all-expressions construct which allows one
to
apply an anonymous method directly to each member of a stream. These anonymous methods
may
contain the special variable it
which is bound to each successive element of
the iterated stream. Below is an alternate implementation of the
PrintTrilogyTitles()
method which uses this construct.
public void PrintTrilogyTitles(){ LoTR().{Console.WriteLine(it)}; }
Choice Types and Nullable Types
Cω's choice types are very similar to union types in programming languages like C
and
C++, the |
operator in DTDs, and the xs:choice
element in W3C XML
Schema. The following is an example of a class that uses a choice type
public class NewsItem{ string title; string author; choice {string pubdate; DateTime date;}; string body; }
In this example, an instance of the NewsItem
class can either have a
pubdate
field of type System.String
or a date
field
of type System.DateTime
but not both. It should be noted that the Cω
compiler enforces the constraint that each field in the choice should have a different
name,
otherwise there would be ambiguity as to what type was intended when the field is
accessed.
The way fields in a choice type are accessed in Cω differs from union types in C and
C++. In C and C++, the programmer has to keep track of what type the value in a particular
union type represents, since no static type checking is done by the compiler. Cω union
types are statically checked by the compiler which allows one the declare them as
such:
choice{string;DateTime;} x = DateTime.Now; choice{string;DateTime;} y = "12/12/2004";
However, there is still the problem of how to declare a type statically which may
depend
on a field in a choice type that does not exist. For example, in the above sample,
x.Length
is not a valid property access if the variable is initialized with
an instance of System.DateTime
but returns 10 when initialized with the string
"12/12/2004". This is where nullable types come into play.
In both the worlds of W3C XML Schema and relational databases, it is possible for
all types
to have instances whose value is null
. Java and existing versions of C# do not
allow one to assign null
to an integer or floating point value type. However,
when working with XML or relational data, it is valuable to be able to state that
null
is a valid value for a type. In such cases, you wouldn't want property
accesses on the value to result in NullReferenceExceptions
being thrown.
Nullable types make this possible by mapping all values returned by a property access
to the
value null. Below are some examples of using nullable types:
string? s = null; int? size = s.Length; // returns null instead of throwing NullReferenceException if(size == null) Console.WriteLine("The value of size is NULL"); choice{string;DateTime;} pubdate = DateTime.Now; int? dateLen = pubdate.Length; //works since it returns null because Length is a property of System.String int dateLen2 = (int) pubdate.Length; //throws NullReferenceException
This feature is similar to but not the same as nullable types in C#
2.0. A nullable type in C# 2.0 is an instance of Nullable<T>
which
contains a value and an indication whether the value is null or not. This is basically
a
wrapper for value types such as ints and floats that can't be null
. Cω
takes this one step further with the behavior of returning null
instead of
throwing a NullReferenceException
on accessing a field or property of a
nullable type whose value is null
.
Anonymous Structs
Anonymous structs are analogous to the xs:sequence
element in W3C XML Schema.
Anonymous structs enable one to model certain XML-centric notions in Cω, such as
document order and the fact that an element may have multiple child elements with
the same
name that have different values. An anonymous struct is similar to a regular struct
in C#
with a few key distinctions:
- There is no explicit type name for an anonymous struct.
- Fields within an anonymous struct are ordered, which allows them to be accessed via the array index operator.
- The fields within an anonymous struct do not have to have a name; they can just have a type.
- An anonymous struct can have multiple fields with the same name. In this case, accessing these fields by name results in a stream being returned.
- Anonymous structs with similar structure (i.e. same member types in the same order) are compatible, and variables of such structs can be assigned back & forth.
The following examples highlight the various characteristics of anonymous structs in Cω:
struct{ int; string; string; DateTime date; string;} x = new {47, "Hello World", "Dare Obasanjo", date=DateTime.Now, "This is my first story"}; Console.WriteLine(x[1]); DateTime pubDate = x.date; struct{ long; string; string; DateTime date; string;} newsItem = x; Console.WriteLine(newsItem[1] + " by " + newsItem[2] + " on " + newsItem.date); struct {string field; string field; string field;} field3 = new {field="one", field="two", field="three"}; string* strField = field3.field; //string strField = field3.field doesn't work since field3.field returns a stream struct {int value; string value;} tricky = new {value=10, value="ten"}; choice {int; string;}* values = tricky.value;
Content Classes
A content class is a class that has its members grouped into distinct units using
the
struct
keyword. To some degree, content classes are analogous to DTDs in XML.
The following content class for a Books
object
public class Books{ struct{ string title; string author; string publisher; string? onloan; }* Book; }
is analogous to the following DTD
<!ELEMENT Books (Book*)> <!ELEMENT Book (title, author,publisher, onloan?)> <!ELEMENT title (#PCDATA)> <!ELEMENT author (#PCDATA)> <!ELEMENT publisher (#PCDATA)> <!ELEMENT onloan (#PCDATA)>
or the following W3C XML Schema
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="Books"> <xs:complexType> <xs:sequence minOccurs="0" maxOccurs="0"> <xs:element name="Book"> <xs:complexType> <xs:sequence> <xs:element name="title" type="xs:string" /> <xs:element name="author" type="xs:string" /> <xs:element name="publisher" type="xs:string" /> <xs:element name="onloan" type="xs:string" minOccurs="0"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
The following code sample shows how the Books
object would be used
using Microsoft.Comega; using System; public class Books{ struct{ string title; string author; string publisher; string? onloan; }* Book; public static void Main(){ Books books = new Books(); books.Book = new struct {title="Essential.NET", author="Don Box", publisher="Addison-Wesley", onloan = (string?) null}; Console.WriteLine((string) books.Book.author + " is the author of " + (string) books.Book.title); } }
Query Operators in Cω
Cω adds two broad classes of query operators to the C# language: XPath-based operators for querying the member variables of an object by name or by type and SQL-based operators for performing sophisticated queries involving projection, grouping, and joining of data from one or more objects.
XPath-Based Operators
With the existence of streams and anonymous structs that can have multiple members
with the
same name, even ordinary direct member access with the .
operator in Cω
can be considered a query operation. For example, the operation
books.Book.title
from the previous section returns the titles of all the
Book
objects contained within the Books
class. This is
equivalent to the XPath query /Books/Book/title
which returns all the titles of
Book
elements contained within the Books
element.
The wildcard member access operator .*
is used to retrieve all the fields of a
type. This operator is equivalent to the XPath query child::*
which returns all
the child nodes of the principal node type of the current node. For example, the operation
books.Book.*
returns a stream container of all the members of all the
Book
objects contained within the Books
class. This is
equivalent to the XPath query /Books/Book/*
which returns all the child
elements of the Book
elements contained within the Books
element.
Cω also supports transitive member access using the ...
operator which
is analogous to the descendent-or-self
axis or ..
abbreviated path
in XPath. The operation books...title
returns a stream containing all
title
member fields that are contained within the Books
class or
are member fields of any of its contents in a recursive manner. This is equivalent
to the
XPath query /Books//title
which returns all title
elements that
are descendants of the Books
element. The transitive member access operator can
also be used to match nodes restricted to a particular type using syntax of the form
...typename::*
. For example, the operation
books...string::*
returns a stream containing all the member fields of type
System.String
that are contained within the Books
class or are
member fields of any of its contents in a recursive manner. This is analogous to the
XPath
2.0 query /Books//element(*, xs:string)
which matches any descendent of
the Books
element of type xs:string
.
Filter operations can be applied to the results of a transitive member access in the
same
manner that predicates can be used to filter XPath queries. Just as in XPath, a Cω
filter is applied to a query operation by using the [expression]
operator placed after the query. As is the case with apply-to-all-expressions, the
filter
may contain the special variable it
which is bound to each successive element
of the iterated stream. Below is an example, which queries all the fields of type
System.Int32
in an anonymous struct then filters the results to those whose
value is greater than 8.
struct {int a; int b; int c;} z = new {a=5, b=10, c= 15}; int* values = z...int::*[it > 8]; foreach(int i in values){ Console.WriteLine(i + " is greater than 8"); }
SQL-Based Operators
Cω includes a number of constructs from the SQL language as keywords. Operators for performing selection with projection, filtering, ordering, grouping, and joins are all built into Cω. The SQL operators can be applied to both in-memory objects and relational stores that can be accessed via ADO.NET. When applied to a relational database, the Cω query operators are converted to SQL queries over the underlying store. The primary advantage of using the SQL operators from the Cω language is that the query syntax and results can be checked at compile time instead of at runtime, as is the case with embedding SQL expressions in strings using traditional relational APIs.
To connect to a SQL database in Cω, it must be exposed as a managed assembly (i.e.
a
.NET library file), which is then referenced by the application. A relational database
can
be exposed to a Cω as a managed assembly either by using the
sql2comega.exe
command line tool or the Add Database Schema...
dialog from within Visual Studio. Database
objects are used by Cω to
represent the relational database hosted by the server. A Database
object has a
public property for each table or view and a method for each table-valued function
found in
the database. To query a relational database, a table, view, or table-valued function
must
be specified as input to one or more of the SQL-based operators.
The following sample program and output shows some of the capabilities of using the
SQL-based operators to query a relational database in Cω. The database used in this
example is the sample Northwind database that comes with Microsoft SQL Server. The
name
DB
used in the example refers to a global instance of a Database object in
the Northwind
namespace of the Northwind.dll
assembly generated
using sql2comega.exe
.
using System; using System.Data.SqlTypes; using Northwind; class Test { static void Main() { // The foreach statement can infer the type of the iteration variable 'row' // statically evaluating the result type of the select expression foreach( row in select ContactName from DB.Customers ) { Console.WriteLine("{0}", row.ContactName); } } }
The following sample program and output shows some of the capabilities of using the SQL-based operators to query in-memory objects in Cω:
Code Sample using Microsoft.Comega; using System; public class Test{ enum CDStyle {Alt, Classic, HipHop} static struct{ string Title; string Artist; CDStyle Style; int Year;}* CDs = new{new{ Title="Lucky Frog", Artist="Holly Holt", Style=CDStyle.Alt, Year=2001}, new{ Title="Kamikaze", Artist="Twista", Style=CDStyle.HipHop, Year=2004}, new{ Title="Stop Light Green", Artist="Robert O'Hara", Style=CDStyle.Alt, Year=1981}, new{ Title="Noctures", Artist="Chopin", Style=CDStyle.Classic, Year=1892}, new{ Title="Mimosa!", Artist="Brian Groth", Style=CDStyle.Alt, Year=1980}, new {Title="Beg For Mercy", Artist="G-Unit", Style=CDStyle.HipHop, Year=2003} }; public static void Main(){ struct { string Title; string Artist;}* results; Console.WriteLine("QUERY #1: select Title, Artist from CDs where Style == CDStyle.HipHop"); results = select Title, Artist from CDs where Style == CDStyle.HipHop; results.{ Console.WriteLine("Title = {0}, Artist = {1}", it.Title, it.Artist); }; Console.WriteLine(); struct { string Title; string Artist; int Year;}* results2; Console.WriteLine("QUERY #2: select Title, Artist, Year from CDs order by Year"); results2 = select Title, Artist, Year from CDs order by Year; results2.{ Console.WriteLine("Title = {0}, Artist = {1}, Year = {2}", it.Title, it.Artist, it.Year); }; } } Output QUERY #1: select Title, Artist from CDs where Style == CDStyle.HipHop Title = Kamikaze, Artist = Twista Title = Beg For Mercy, Artist = G-Unit QUERY #2: select Title, Artist, Year from CDs order by Year Title = Noctures, Artist = Chopin, Year = 1892 Title = Mimosa!, Artist = Brian Groth, Year = 1980 Title = Stop Light Green, Artist = Robert O'Hara, Year = 1981 Title = Lucky Frog, Artist = Holly Holt, Year = 2001 Title = Beg For Mercy, Artist = G-Unit, Year = 2003 Title = Kamikaze, Artist = Twista, Year = 2004
A number of operations that require tedious nested loops can be processed in a straightforward manner using the declarative SQL-like operators in Cω. Provided below is a brief description of the major classes of SQL operators included in Cω.
Projection
The projection of the select expression is the list of expressions following the
select
keyword. The projection is executed once for each row specified by the
from
clause. The job of the projection is to shape the resulting rows of data
into rows containing only the columns required. The simplest form of the select command
consists of the select
keyword followed by a projection list of one or more
expressions identifying columns from the source, followed by the from
keyword
and then an expression identifying the source of the query. For example,
rows = select ContactName, Phone from DB.Customers; foreach( row in rows ) { Console.WriteLine("{0}", row.ContactName); }
In the above example, the type designator for the results of the select query are not specified. The Cω compiler automatically infers the correct type. The actual type of an individual result row is a Cω tuple type. One can specify the result type directly using a tuple type (i.e. an anonymous struct) and the asterisk (*) to designate a stream of results. For example,
struct{SqlString ContactName;}* rows = select ContactName from DB.Customers; struct{SqlString ContactName; SqlString Phone;}* rows = select ContactName, Phone from DB.Customers;
Filtering
The results of a select expression can be filtered using one of three keywords;
distinct
, top
and where
. The distinct
keyword is used to restrict the resulting rows to only unique values. The top
keyword is used to restrict the total number of rows produced by the query. The keyword
top
is followed by a constant numeric expression which specifies the number
of rows to return. One can also create a distinct top
selection which restricts
the total number of unique rows returned by the query. The where
clause is used
to specify a Boolean expression for filtering the rows returned by the query source.
Rows
where the expression evaluates to true are retained, while the rest are discarded.
The
example below shows all three filter operators in action:
select distinct top 10 ContactName from DB.Customers where City == "London";
Ordering
The resulting rows from a select expression can be sorted by using the order
by
clause. The order by clause is always the last clause of the select expression,
if it is specified at all. The order by
clause consists of the two keywords
order by
followed immediately by a comma separated list of expressions that
define the values that will determine the order. The first expression in the list
defines
the ordering criteria with the most precedence. It is also possible to specify whether
each
expression should be considered in ascending or descending order. The default for
all
expressions is to be considered in ascending order. The example below shows the order
by
clause in action.
rows = select ContactName, Phone from DB.Customers order by ContactName desc, Phone asc;
Grouping
Values can be aggregated across multiple rows using the group by
clause and
the built-in aggregate functions. The group by
clause enables one to specify
how different rows are actually related so they can be grouped together. Aggregate
functions
can then be applied to the columns to compute values over the group. An aggregate
function
is a function that computes a single value from a series of inputs, such as computing
the
sum or average of a series of numbers. There are six aggregate functions built into
Cω.
They are Count, Min, Max, Avg, Sum and Stddev. To use these functions in a query,
one must
first import the System.Query
namespace. The following example shows how to use
the group by
clause and the built-in aggregate functions.
rows = select Country, Count(Country) from DB.Customers group by Country;
This example uses an aggregate to produce the set of all countries and the count
of
customers within each country. The Count()
aggregate tallies the number of
items in the group.
Aggregates can be used in all clauses evaluated after the group by
clause.
The projection list is evaluated after the group by
clause, even though it is
specified earlier in the query. A consequence of this is that aggregate functions
can not be
applied in the where
clause. However, it is still possible to filter grouped
rows by using the having
clause. The having
clause acts just like
the where
clause, except that it is evaluated after the group by
clause. The following example shows how the having
clause is used.
rows = select Country, Count(Country) from DB.Customers group by Country having Count(Country) > 1;
Joins
Select queries can be used to combine results from multiple tables. A SQL joining
is a
Cartesian product of one or more tables where each row from one table is paired up
with each
row from another table. The full Cartesian product consists of all such pairings.
To select
multiple sources whose data should be joined to perform a query, the from
clause can actually contain a comma-separated list of source expressions, each with
its own
iteration alias. The following example pairs up all Customer rows with their corresponding
Order rows, and produces a table listing the customer's name and the shipping date
for the
order.
rows = select c.ContactName, o.ShippedDate from c in DB.Customers, o in DB.Orders where c.CustomerID == o.CustomerID;
Cω also supports more sophisticated table-joining semantics from the SQL world
including inner join
, left join
, right join
, and
outer join
using the corresponding keywords. Explanations of the semantics of
the various kinds of joins is available at the W3Schools tutorial on SQL JOIN. The
following example shows a select expression that uses the inner join
keyword:
rows = select c.ContactName, o.ShippedDate from c in DB.Customers inner join o in DB.Orders on c.CustomerID == o.CustomerID;
Data Modification
The relational data access capabilities of Cω are not limited to querying data. One
can also insert new rows into a table using the insert
command, modify existing
rows within a table using the update
command, or delete rows from a table using
the delete
command.
The insert
command is an expression that evaluates to the number of
successful inserts that have been made to the table as a result of executing the command.
The following example inserts a new customer in the Customers table:
int n = insert CustomerID = "ABCDE", ContactName="Frank", CompanyName="Acme" into DB.Customers;
The same effect can be obtained by using an anonymous struct as opposed to setting each field directly as shown below:
row = new{CustomerID = "ABCDE", ContactName="Frank", CompanyName="Acme"}; int n = insert row into DB.Customers;
The update
command is an expression that evaluates to the number of rows that
were successfully modified as a result of executing the command. The following example
shows
how to do a global replace of all misspelled references to the city "London":
int n = update DB.Customers set City = "London" where Country == "UK" && City == "Lundon";
It is also possible to modify all rows in the table by omitting the where
clause. The delete
command is an expression that evaluates to the number of
rows that were successfully deleted as a result of executing the command. The following
example deletes all the orders for customers in London:
int n = delete o from c in DB.Customers inner join o in DB.Orders on c.CustomerID == o.CustomerID where c.City == "London";
Most applications that use insert, update, and delete expressions will also use
transactions to guarantee ACIDity
(atomicity, consistency, isolation, and durability) of one or more changes to the
database. The Cω language includes a transact
statement that promotes the
notion of initiating and exiting transactions in a programming language feature. The
transact
statement is a statement bounding a block of code associated with a
database transaction. If the code executes successfully and attempts to exit the block,
the
transaction will automatically be committed. If the block is exited due to a thrown
exception, the transaction will automatically be aborted. The developer can also specify
handlers that execute some user-defined code once the transact block is exited using
the
commit
and rollback
handlers. The following example shows
transactions in action:
transact(DB) { delete from DB.Customers where CustomerID == "ALFKI"; } commit { Console.WriteLine("commited"); } rollback { Console.WriteLine("aborted"); }
Using XML Literals in Cω
In Cω one can construct object instances using XML syntax. This feature is modelled after the ability to construct elements in languages like XQuery and XSLT. Like XQuery and XSLT, the XML literals can contain embedded code for constructing values. However, since Cω is statically typed, the names of members and types must be known at compile time and thus can not be constructed dynamically.
One can control the shape of the XML using a number of constructs. One can specify
that
fields should be treated as attributes in the XML literal with the use of the
attribute
modifier on the member declaration. Similarly, choice types and
anonymous structs are treated as the children of the XML element that maps to the
content
class. The following example shows how to initialize a content class from an XML literal:
using Microsoft.Comega; using System; public class NewsItem{ attribute string title; attribute string author; struct { DateTime date; string body; } public static void Main(){ NewsItem news = <NewsItem title="Hello World" author="Dare Obasanjo"> <date>{DateTime.Now}</date> <body>I am the first post of the New Year.</body> </NewsItem>; Console.WriteLine(news.title + " by " + news.author + " on " + news.date); } }
XML literals are intended to make the process of constructing strongly typed XML
much
simpler. Consider the following XQuery example taken from the W3C XQuery Use cases
document.
It iterates over a bibliogrpahy that contains a number of books. For each book in
the
bibliography, it lists the title and authors, grouped inside a result
element.
for $b in $bs/book return <result> {$b/title} {$b/author} </result>
Performing the equivalent task in Cω looks very similar.
foreach (b in bs.book){ yield return <result> {b.title} {b.author} </result> }
Conclusion
The Cω language is an interesting attempt to bridge the impedence mismatches involved in typical enterprise development efforts when crossing the boundaries of the relational, object oriented, and XML worlds. A number of the ideas in the language have taken hold in academia and within Microsoft as is evidenced by comments by Anders Hejlsberg about the direction of C# 3.0. Until then, developers can explore what it means to program in a more data-oriented programming language by downloading the Cω compiler or Visual Studio.NET plugin
Acknowledgements
I'd like to thank Don Box, Mike Champion, Denise Draper, and Erik Meijer for their ideas and feedback while writing this article.