It was the year 1994 that the web came out of the shadow of academia and onto everyone’s screens. In particular, it was the second half of the second week of December 1994 that capped off the year with three eventful days.
Members of the World Wide Web Consortium huddled around a table at MIT on Wednesday, December 14th. About two dozen people made it to the meeting, representatives from major tech companies, browser makers, and web-based startups. They were there to discuss open standards for the web.
When done properly, standards set a technical lodestar. Companies with competing interests and priorities can orient themselves around a common set of agreed upon documentation about how a technology should work. Consensus on shared standards creates interoperability; competition happens through user experience instead of technical infrastructure.
The World Wide Web Consortium, or W3C as it is more commonly referred to, had been on the mind of the web’s creator, Sir Tim Berners-Lee, as early as 1992. He had spoken with a rotating roster of experts and advisors about an official standards body for web technologies. The MIT Laboratory for Computer Science soon became his most enthusiastic ally. After years of work, Berners-Lee left his job at CERN in October of 1994 to run the consortium at MIT. He had no intention of being a dictator. He had strong opinions about the direction of the web, but he still preferred to listen.
On the agenda — once basic introductions were out of the way — was a long list of administrative details that needed to be worked out. The role of the consortium, the way it conducted itself, and its responsibilities to the wider web were little more than sketched out at the beginning of the meeting. Little by little, the 25 or so members walked through the list. By the end of the meeting, the group felt confident that the future of web standards was clear.
The next day, December 15th, Jim Clark and Marc Andreessen announced the recently renamed Netscape Navigator version 1.0. It had been out for several months in beta, but that Thursday marked a wider release. In a bid for a growing market, it was initially given away for free. Several months later, after the release of version 1.1, Netscape would be forced to walk that back. In either case, the browser was a commercial and technical success, improving on the speed, usability, and features of browsers that had come before it.
On Friday, December 16th, the W3C experienced its first setback. Berners-Lee never meant for MIT to be the exclusive site of the consortium. He planned for CERN, the birthplace of the web and home to some of its greatest advocates, to be a European host for the organization. On December 16th, however, CERN approved a massive budget for its Large Hadron Collider, forcing them to shift priorities. A refocused budget left little room for hypertext Internet experiments not directly contributing to the central project of particle physics.
CERN would no longer be the European host of the W3C. All was not lost. Months later, the W3C set up at France’s National Institute for Research in Computer Science and Control, or INRIA. By 1996, a third site at Japan’s Keio University would also be established.
Far from an outlier, it would not be the last setback the W3C faced, nor the last it overcame.
In 1999, Berners-Lee published an autobiographical account of the web’s creation in a book entitled Weaving the Web. It is a concise and even history, a brisk walk through the major milestones of the web’s first decade. Throughout the book, he often returns to the subject of the W3C.
He frames the web consortium, first and foremost, as a matter of compromise. “It was becoming clear to me that running the consortium would always be a balancing act, between taking the time to stay as open as possible and advancing at the speed demanded by the onrush of technology.” Striking a balance between shared compatibility and shorter and shorter browser release cycles would become a primary objective of the W3C.
Web standards, he concedes, thrive through tension. Standards are developed amidst disagreement and hard-won bargains. Recalling a time just before the W3C’s creation, Berners-Lee notes how the standards process reflects the structure of the web. “It struck me that these tensions would make the consortium a proving ground for the relative merits of weblike and treelike societal structures,” he wrote. “I was eager to start the experiment.” A web consortium born of compromise and defined by tension, however, was not Berners-Lee’s first plan.
In March of 1992, Berners-Lee flew to San Diego to attend a meeting of the Internet Engineering Task Force, or IETF. Created in 1986, the IETF develops standards for the Internet, ranging from networking to routing to DNS. IETF standards are unenforceable and entirely voluntary. They are not sanctioned by any world government or subject to any regulations. No entity is obligated to use them. Instead, the IETF relies on a simple conceit: interoperability helps everyone. It has been enough to sustain the organization for decades.
Because everything is voluntary, the IETF is managed by a labyrinthine set of rules and ritualistic processes that can be difficult to understand. There is no formal membership, though anyone can join (in its own words it has “no members and no dues”). Everyone is a volunteer; no one is paid. The group meets in person three times a year at shifting locations.
The IETF operates on a principle known as rough consensus (and, oftentimes, running code). Rather than a formal voting process, disputed proposals need to reach a point where most, if not all, of the members in a technology working group agree. Working group members decide when rough consensus has been met, and its criteria shift from year to year and group to group. In some cases, the IETF has turned to humming to take the temperature of a room. “When, for example, we have face-to-face meetings… instead of a show of hands, sometimes the chair will ask for each side to hum on a particular question, either ‘for’ or ‘against’.”
It is against the backdrop of these idiosyncratic rules that Berners-Lee first came to the IETF in March of 1992. He hoped to set up a working group for each of the primary technologies of the web: HTTP, HTML, and the URI (which would later be renamed to URL through the IETF). In March he was told he would need another meeting, this one in June, to formally propose the working groups. Somewhere close to the end of 1993, a year and a half after he began, he had persuaded the IETF to set up all three.
The process of rough consensus can be slow. The web, by contrast, had redefined what fast could look like. New generations of browsers were coming out in months, not years. And this was before Netscape and Microsoft got involved.
The development of the web had spiraled outside Berners-Lee’s sphere of influence. Inline images — the feature perhaps most responsible for the web’s success — were the product of a late-night brainstorming session over snacks and soda in the basement of a university lab. Berners-Lee learned about it when everyone else did, when Marc Andreessen posted it to the www-talk mailing list.
Tension. Berners-Lee knew that it would come. He had hoped, for instance, that images might be treated differently (“Tim bawled me out in the summer of ’93 for adding images to the thing,” Andreessen would later say), but the web was not his. It was not anybody’s. He had designed it that way.
With all of its rules and rituals, the IETF did not seem like the right fit for web standards. In private discussions at universities and research labs, Berners-Lee had begun to explore a new path: something like a consortium of stakeholders in the web — a collection of companies that create browsers and websites and software — that could come together and agree upon a rough consensus of their own. By the end of 1993, his work on the W3C had already begun.
Dave Raggett, a seasoned researcher at Hewlett-Packard, had a different view of the web. He wasn’t from academia, and he wasn’t working on a browser (not yet anyway). He understood almost instinctively the utility of the web as commercial software. Something less like a digital phonebook and more like Apple’s wildly successful HyperCard application.
Unable to convince his bosses of the web’s promise, Raggett used the ten percent of time HP allowed for its employees to pursue independent research to begin working with the web. He anchored himself to the community, an active member of the www-talk mailing list and a regular presence at IETF meetings. In the fall of 1992, he had a chance to visit with Berners-Lee at CERN.
It was around this time that he met Yuri Rubinsky, an enthusiastic advocate for Standard Generalized Markup Language, or SGML, the language that HTML was originally based on. Rubinsky believed that the limitations of HTML could be solved by a stricter adherence to the SGML standard. He had begun a campaign to bring SGML to the web. Raggett agreed — but to a point. He was not yet ready to sever ties with HTML.
Each time Mosaic shipped a new version, or a new browser was released, the gap between the original HTML specification and the real world web widened. Raggett believed that a more comprehensive record of HTML was required. He began working on an enhanced version of HTML, and a browser to demo its capabilities. Its working title was HTML+.
Raggett’s work soon began to spill over to his home life. He’d spend most nights “at a large computer that occupied a fair portion of the dining room table, sharing its slightly sticky surface with paper, crayons, Lego bricks and bits of half-eaten cookies left by the children.” After a year of around-the-clock work, Raggett had a version of HTML+ ready to go in November of 1993. His improvements to the language were far from superficial. He had managed to add all of the little things that had made their way into browsers: tables, images with captions and figures, and advanced forms.
Several months later, in May of 1994, developers and web enthusiasts traveled from all over the world to come to what some attendees would half-jokingly refer to as the “Woodstock of the Web,” the first official web conference organized by CERN employee and web pioneer Robert Cailliau. Of the 800 people clamoring to come, the space in Geneva could hold only 350. Many were meeting for the first time. “Everyone was milling about the lobby,” web historian Marc Weber would later describe, “electrified by the same sensation of meeting face-to-face actual people who had been just names on an email or on the www-talk [sic] mailing list.”
It came at a moment when the web stood on the precipice of ubiquity. Nobody from the Mosaic team had managed to make it (they had their own competing conference set for just a few months later), but there were already rumors about Mosaic alum Marc Andreessen’s new commercial browser that would later be called Netscape Navigator. Mosaic, meanwhile, had begun to license their browser for commercial use. An early version of Yahoo! was growing exponentially as more and more publications, like GNN, Wired, The New York Times, and The Wall Street Journal, came online.
Progress at the IETF, on the other hand, had been slow. It was too meticulous, too precise. In the meantime, browsers like Mosaic had begun to add whatever they wanted — particularly to HTML. Tags supported by Mosaic couldn’t be found anywhere else, and website creators were forced to choose between cutting-edge technology and compatibility with other browsers. Many were choosing the former.
HTML+ was the biggest topic of conversation at the conference. But another highlight was when Dan Connolly — a young, “red-haired, navy-cut Texan” who worked at the supercomputer manufacturer Convex — took the stage. He gave a talk called “Interoperability: Why Everyone Wins.” Later, and largely because of that talk, Connolly would be made chair of the IETF HTML Working Group.
In a prescient moment capturing the spirit of the room, Connolly described a future in which the language of HTML fractured, each browser implementing its own set of HTML tags in an effort to edge out the competition. The solution, he concluded, was an HTML standard that was able to evolve at the pace of browser development.
Raggett’s HTML+ made a strong case for becoming that standard. It was exhaustive, describing the new HTML used in browsers like Mosaic in near-perfect detail. “I was always the minimalist, you know, you can get it done without that,” Connolly later said. “Raggett, on the other hand, wanted to expand everything.” The two struck an agreement. Raggett would continue to work through HTML+ while Connolly focused on a more narrow upgrade.
Connolly’s version would soon become HTML 2, and after a year of back and forth and rough consensus building at the IETF, it became an official standard. It didn’t have nearly the detail of HTML+, but Connolly was able to officially document features that browsers had been supporting for years.
Raggett’s proposal, renamed to HTML 3, was stuck. In an effort to accommodate an expanding web, it continued to grow in size. “To get consensus on a draft 150 pages long and about which everyone wanted to voice an opinion was optimistic – to say the least,” Raggett would later put it, rather bluntly. But by then, Raggett was already working at the W3C, where HTML 3 would soon become a reality.
Berners-Lee also spoke at the first web conference in Geneva, closing it out with a keynote address. He didn’t specifically mention the W3C. Instead, he focused on the role of the web. “The people present were the ones now creating the Web,” he would later write of his speech, “and therefore were the only ones who could be sure that what the systems produced would be appropriate to a reasonable and fair society.”
In October of 1994, he embarked on his own part in making a more equitable and accessible future for the web. The World Wide Web Consortium was officially announced. Berners-Lee was joined by a handful of employees — a list that included both Dave Raggett and Dan Connolly. Two months later, in the second half of the second week of December of 1994, the members of the W3C met for the first time.
Before the meeting, Berners-Lee had a rough sketch of how the W3C would work. Any company or organization could join, provided it paid the membership fee, which followed a tiered pricing structure based on the size of the company. Member organizations would send representatives to W3C meetings to provide input into the process of creating standards. By limiting W3C proceedings to paying members, Berners-Lee hoped to focus and scope the conversations to real-world implementations of web technologies.
Yet despite a closed membership, the W3C operates in the open whenever possible. Meeting notes and documentation are available to the public. Any code written as part of experiments in new standards is freely downloadable.
Gathered at MIT, the W3C members next had to decide how its standards would work. They decided on a process that stops just short of rough consensus. Though they are often called standards, the W3C does not create official standards for the web. The technical specifications created at the W3C are known, in their final form, as recommendations.
They are, in effect, proposals. They outline, in great detail, how exactly a technology works. But they leave enough open that it is up to browsers to figure out exactly how the implementation works. “The goal of the W3C is to ensure interoperability of the Web, and in the long range that’s realistic,” former head of communications at the W3C Sally Khudairi once described it, “but in the short range we’re not going to play Web cops for compliance… we can’t force members to implement things.”
Initial drafts create a feedback loop between the W3C and its members. They provide guidance on web technologies, but even as specifications are in the process of being drafted, browsers begin to introduce them and developers are encouraged to experiment with them. Each time issues are found, the draft is revised, until enough consensus has been reached. At that point, a draft becomes a recommendation.
There would always be tension, and Berners-Lee knew that well. The trick was not to try to resist it, but to create a process where it becomes an asset. Such was the intended effect of recommendations.
At the end of 1995, the IETF HTML working group was replaced by a newly created W3C HTML Editorial Review Board. HTML 3.2 would be the first HTML version released entirely by the W3C, based largely on Raggett’s HTML+.
There was a year in web development, 1997, when browsers broke away from the still-new recommendations of the W3C. Microsoft and Netscape began to release a new set of features separate and apart from agreed upon standards. They even had a name for them. They called them Dynamic HTML, or DHTML. And they almost split the web in two.
DHTML was originally celebrated. Dynamic meant fluid. A natural evolution from HTML’s initial inert state. The web, in other words, came alive.
Touting its capabilities, a 1997 feature in Wired referred to DHTML as the “magic wand Web wizards have long sought.” In its enthusiasm for the new technology, the piece made a small note that “Microsoft and Netscape, to their credit, have worked with the standards bodies,” specifically on the introduction of Cascading Style Sheets, or CSS, but that most features were being added “without much regard for compatibility.”
The truth on the ground was that using DHTML required targeting one browser or another, Netscape or Internet Explorer. Some developers simply picked a path, slapping a banner at the bottom of their site that displayed “Best Viewed In…” one browser or the other. Others ignored the technology entirely, hoping to avoid its tangled complexity.
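In practice, that meant code forked at the root. Below is a simplified, hypothetical sketch of the era’s browser-sniffing; it leans on the two proprietary object models of the day, Internet Explorer 4’s document.all and Netscape 4’s document.layers. The element name and styling are invented for illustration.

```html
<div id="box" style="position: absolute; left: 0px;">Hello</div>
<script type="text/javascript">
  // A hypothetical DHTML routine, effectively written twice: once per browser.
  function moveBox() {
    if (document.all) {
      // Internet Explorer 4's proprietary object model
      document.all["box"].style.left = "100px";
    } else if (document.layers) {
      // Netscape Navigator 4's proprietary layers model
      document.layers["box"].left = 100;
    }
  }
</script>
```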
Browsers had their reasons, of course. Developers and users were asking for things not included in the official HTML specification. As one Microsoft representative put it, “In order to drive new technologies into the standards bodies, you have to continue innovating… I’m responsible to my customers and so are the Netscape folks.”
A more dynamic web was not a bad thing, but a splintered web was untenable. For some developers, it would prove to be the final straw.
Following the release of HTML 3.2, and with the rapid advancement of browsers, the HTML Editorial Review Board was divided into three parts. Each was given a separate area of responsibility to make progress on, independent of the others.
Dr. Lauren Wood became chair of the Document Object Model Working Group. A former theoretical nuclear physicist, Wood was the Director of Product Technology at SoftQuad, a company founded by SGML advocate Yuri Rubinsky. While there, she helped work on the HoTMetaL HTML editor. The DOM spec created a standardized way for browsers to implement Dynamic HTML. “You need a way to tie your data and your programs together,” was how Wood described it, “and the Document Object Model is that glue.” Her work on the Document Object Model, and later XML, would have a long-lasting influence on the web.
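As a rough sketch of what that glue looked like once the group’s work shipped: under the first DOM specification, a script could reach into the document tree through one standard method instead of a vendor’s private collections. The element id here is invented for illustration.

```html
<p id="status">Loading…</p>
<script type="text/javascript">
  // The DOM: a single, standard way to find a node in the document
  // tree and change it, the same in any compliant browser.
  var node = document.getElementById("status");
  node.firstChild.nodeValue = "Ready.";
</script>
```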
The Cascading Style Sheets Working Group was chaired by Chris Lilley. Lilley’s background was in computer graphics, as a teacher and specialist in the Computer Graphics Unit at the University of Manchester. Lilley had worked at the IETF on the HTML 2 spec, as well as a specification for Portable Network Graphics (PNG), but this would mark his first time as a working group chair.
CSS was still a relative newcomer in 1997. It had been in the works for years, but had yet to have a major release. Lilley would work alongside the creators of CSS — Håkon Lie and Bert Bos — to create the first CSS standard.
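To give a sense of what the group was standardizing, here is a small sketch in the spirit of that first CSS specification: rules built from selectors and declarations, keeping presentation out of the markup itself. The particular properties and values are arbitrary.

```html
<style type="text/css">
  /* Selectors match elements; declarations set their presentation. */
  h1    { font-family: sans-serif; color: navy; }
  p     { margin-left: 2em; }
  .note { font-style: italic; }  /* any element with class="note" */
</style>
```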
The final working group was for HTML, left under the auspices of Dan Connolly, continuing his position from the IETF. Connolly had been around the web almost as long as Berners-Lee had. He was one of the people watching back in October of 1991, when Berners-Lee demoed the web for a small group of unimpressed people at a hypertext conference in San Antonio. In fact, it was at that conference that he first met the woman who would later become his wife.
After he returned home, he experimented with the web. He messaged Berners-Lee a month later. The message was just four words: “You need a DTD.”
When Berners-Lee developed the language of HTML, he borrowed its conventions from a predecessor, SGML. IBM developed Generalized Markup Language (GML) in the early 1970s to make it easier for typists to create formatted books and reports. However, it quickly got out of control, as people would take shortcuts and use whatever version of the tags they wanted.
That’s when they developed the Document Type Definition, or, as Connolly called it, a DTD. DTDs are what added the “S” (Standard) to GML. Using SGML, you can create a standardized set of instructions for your data, its schema and its structure, to help computers understand how to interpret it. These instructions are a document type definition.
Beginning with version 2, Connolly added a type definition to HTML. It limited the language to a smaller set of agreed-upon tags. In practice, browsers treated this more as a loose definition, continuing to implement their own DHTML features and tags. But it was a first step.
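Concretely, an HTML 2.0 document announced its conformance with a single declaration at the top of the file, which pointed to the DTD. Inside the DTD, rules spelled out which tags existed and what they could contain; the second line below is a simplified rule of that kind, not a verbatim excerpt.

```html
<!-- The declaration at the top of an HTML 2.0 document: -->
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

<!-- A simplified rule of the kind found inside the DTD itself: -->
<!ELEMENT UL - - (LI)+ -- an unordered list holds one or more list items -->
```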
In 1997, the HTML Working Group, now inside of the W3C, began to work on the fourth iteration of HTML. It expanded the language, adding far more advanced features to the specification: complex tables and forms, better accessibility, and a more defined relationship with CSS. But it also split HTML from a single schema into three different document type definitions for browsers to adopt.
The first, Frameset, was not typically used. The second, Transitional, was there to include the mistakes of the past. It included a larger subset of HTML, encompassing the non-standard, presentational tags that browsers had used for years, such as <font> and <center>. This was set as the default for browsers.
The third DTD was called Strict. Under the Strict definition, HTML was pared down to only its standard, non-presentational features. It removed all of the unique tags introduced by Netscape and Microsoft, leaving only structured elements. If you use HTML today, it likely draws on the same base of tags.
The Strict definition drew a line in the sand. It said: this is HTML. And it finally gave developers a way to write code once for every browser.
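In practice, a page opted into one of the three definitions with the doctype on its very first line. As they appeared in the later HTML 4.01 revision, the three declarations read:

```html
<!-- A page uses exactly one of these. Strict: -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">

<!-- Transitional: -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">

<!-- Frameset: -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
    "http://www.w3.org/TR/html4/frameset.dtd">
```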
In the August 1998 issue of Computerworld — tucked between large features on the impending doom of Y2K, the bristling potential of billing on the World Wide Web, and antitrust concerns about Microsoft — was a small announcement. Its headline read, “Browser standards targeted.” It was about the creation of a new grassroots organization of web developers aimed at bringing web standards support to browsers. It was called the Web Standards Project.
Glenn Davis, co-creator of the project, was quoted in the announcement. “The problem is, with each generation of the browser, the browser manufacturers diverge farther from standards support.” Developers, forced to write different code for different browsers for years, had simply had enough. A few off-hand conversations in mailing lists had spiraled into a fully grown movement. At launch, 450 developers and designers had already signed up.
Davis was not new to the web, and he understood its challenges. His first experience on the web dated all the way back to 1994, just after Mosaic had first introduced inline images, when he created the gallery site Cool Site of the Day. Each day, he would feature a single homepage from an interesting or edgy or experimental site. For a still small community of web designers, it was an instant hit.
There were no criteria other than sites that Davis thought were worth featuring. “I was always looking for things that push the limits,” was how he would later define it. Davis helped to redefine the expectations of the early web, using the moniker cool as a shorthand to encompass many possibilities. Dot-com Design author and media professor Megan Ankerson points out that “this ecosystem of cool sites gestured towards the sheer range of things the web could be: its temporal and spatial dislocations, its distinction from and extension of mainstream media, its promise as a vehicle for self-publishing, and the incredible blend of personal, mundane, and extraordinary.” For a time on the web, Davis was the arbiter of cool.
As time went on Davis transformed his site into Project Cool, a resource for creating websites. In the days of DHTML, Davis’ Project Cool tutorials provided constructive and practical techniques for making the most out of the web. And a good amount of his writing was devoted to explaining how to write code that was usable in both Netscape Navigator and Microsoft’s Internet Explorer. He eventually reached a breaking point, along with many others. At the end of 1997, Netscape and Microsoft both released their 4.0 browsers with spotty standards support. It was already clear that upcoming 5.0 releases were planning to lean even further into uneven and contradictory DHTML extensions.
Running out of patience, Davis helped set up a mailing list with George Olsen and Jeffrey Zeldman. The list started with two dozen people, but it gathered support quickly. The Web Standards Project, known as WaSP, officially launched from that list in August of 1998. It began with a few hundred members and announcements in magazines like Computerworld. Within a few months, it would have tens of thousands of members.
The strategy for WaSP was to push browsers — publicly and privately — into web standards support. WaSP was not meant to be a hyperbolic name. “The W3C recommends standards. It cannot enforce them,” Zeldman once said of the organization’s strategy, “and it certainly is not about to throw public tantrums over non-compliance. So we do that job.”
A prominent designer and standards advocate, Zeldman would have an enduring influence on makers of the web. He would later run WaSP during some of its most influential years. His website and mailing list, A List Apart, would become a gathering place for designers who cared about web standards and using the latest web technologies.
WaSP would change focus several times during their decade-and-a-half tenure. They pushed browsers to make better use of HTML and CSS. They taught developers how to write standards-based code. They advocated for greater accessibility and tools that supported standards out of the box.
But their mission, published to their website on the first day of launch, would never falter. “Our goal is to support these core standards and encourage browser makers to do the same, thereby ensuring simple, affordable access to Web technologies for all.”
WaSP succeeded in their mission on a few occasions early on. Some browsers, notably Opera, had standards baked in at the beginning; their efforts were praised by WaSP. But the two browsers that collectively made up a majority of web use — Internet Explorer and Netscape Navigator — would need some work.
A four-billion-dollar sale to AOL in 1998 was not enough for Netscape to compete with Microsoft. After the release of Netscape 4.0, they doubled down on a bold strategy, choosing to release the entire browser’s code as open source under the Mozilla project. Everyday consumers could download it for free; coders were encouraged to contribute directly.
Members of the community soon noticed something in Mozilla. It had a new rendering engine, often referred to as Gecko. Unlike planned releases of Netscape 5, which had patchy standards support at best, Gecko supported a fairly complete version of HTML 4 and CSS.
WaSP diverted their formidable membership to the task of pushing Netscape to include Gecko in its next major release. One familiar WaSP tactic was known as roadblocking. Some of its members worked at publications like HotWired and CNet. WaSP would coordinate articles across several outlets all at once criticizing, for instance, Netscape’s neglect of standards in the face of a perfectly reasonable solution in Gecko. By doing so, they were often able to capture the attention of at least one news cycle.
WaSP also took more direct action. Members were asked to send emails to browser makers, or sign petitions showing widespread support for standards. Overwhelming pressure from developers was occasionally enough to push browsers in the right direction.
In part because of WaSP, Netscape agreed to make Gecko part of version 5.0. Beta versions of Netscape 5 would indeed have standards-compliant HTML and CSS, but it was beset with issues elsewhere. It would take years for a release. By then, Microsoft’s dominion over the browser market would be near complete.
As one of the largest tech companies in the world, Microsoft was more insulated from grassroots pressure. The on-the-ground tactics of WaSP proved less successful when turned against the tech giant.
But inside the walls of Microsoft, WaSP had at least one faithful follower: developer Tantek Çelik. Çelik had fought tirelessly on the side of web standards for as long as his web career stretched back. He would later become a member of the WaSP Steering Committee and a representative for a number of working groups at the W3C, working directly on the development of standards.
Çelik ran a team inside of Internet Explorer for Mac. Though it shared a name, branding, and general features with its far more ubiquitous Windows counterpart, IE for Mac ran on a separate codebase. Çelik’s team was largely left to its own devices in a colossal organization with other priorities working on a browser that not many people were using.
With the direction of the browser largely left up to him, Çelik began to reach out to web designers in San Francisco at the cutting edge of web technology. Through a stroke of luck he was connected to several members of the Web Standards Project. He’d visit with them and ask what they wanted to see in the Mac IE browser. “The answer: better standards support.”
They helped Çelik realize that his work on a smaller browser could be impactful. If he was able to support standards, as they were defined by the W3C, it could serve as a baseline for the code that the designers were writing. They had enough to worry about with buggy standards in IE for Windows and Netscape, in other words. They didn’t need to also worry about IE for Mac.
That was all that Çelik needed to hear. When Internet Explorer 5.0 for Mac launched in 2000, it had across-the-board support for web standards: HTML, PNG images, and, most impressively, one of the most ambitious implementations of the new Cascading Style Sheets (CSS) specification.
It would take years for the Windows version to get anywhere close to the same kind of support. Even half a decade later, after Çelik left to work at the search engine Technorati, they were still playing catch-up.
Towards the end of the millennium, the W3C found themselves at a fork in the road. They looked to their still-recent past and saw it filled with contentious support for standards — incompatible browsers with their own priorities. Then they looked the other way, to their towering future. They saw a web that was already evolving beyond the confines of personal computers. One that would soon exist on TVs and in cell phones and on devices that hadn’t been dreamed up yet, in paradigms yet to be invented. Their past and their future were incompatible. And so, they reacted.
Yuri Rubinsky had an unusual talent for making connections. In his time as a standards advocate, developer, and executive at a major software company, he had managed to find time to connect some of the web’s most influential proponents. Sadly, Rubinsky died suddenly and at a young age in 1996, but his influence would not soon be forgotten. He carried with him an infectious energy and a knack for persuasion. His friend and colleague Peter Sharpe would say upon his death that in “talking to the people from all walks of life who knew Yuri, there was a common theme: Yuri had entered their lives and changed them forever.”
Rubinsky devoted his career to making technology more accessible. He believed that without equitable access, technology was not worth building. It motivated all of the work he did, including his longstanding advocacy of SGML.
SGML is a meta-language and “you use it to build your own computer languages for your own purposes.” If you hand a document over to a computer, SGML is how you can give that computer instructions on how to understand it. It provides a standardized way to describe the structure of data — the tags that it uses and the order it is expected in. The ownership of data, therefore, is not locked up and defined at some unknown level; it is given to everybody.
Rubinsky believed in that kind of universal access, a world in which machines talked to each other in perfect harmony, passing sets of data between them, structured, ordered, and formatted for its users. His company, SoftQuad, built software for SGML. He organized and spoke at conferences about it. He created SGML Open, a consortium not unlike the W3C. “SGML provides an internationally standardized, vendor-supported, multi-purpose, independent way of doing business,” was how he once described it. “If you aren’t using it today, you will be next year.” He was almost right.
He had a mission on the web as well. HTML is actually based on SGML, though it uses only a small part of it. Rubinsky was beginning to have conversations with members of the W3C, like Berners-Lee and Raggett, about bringing a more comprehensive version of SGML to the web. He was even writing a book called SGML on the Web before his death.
In the hallways of conferences and in threaded mailing lists, Rubinsky used his unique propensity for persuasion to bring several people together on the subject, including Dan Connolly, Lauren Wood, Jon Bosak, James Clark, Tim Bray, and others. Eventually, those conversations moved into the W3C. A working group was formed and, in November of 1996, eXtensible Markup Language (XML) was formally announced, and later adopted as a W3C Recommendation. The announcement took place at an annual SGML conference in Boston, run by an organization where Rubinsky sat on the Board of Directors.
XML is SGML, minus a few things, renamed and repackaged as a web language. That means it goes far beyond the capabilities of HTML, giving developers a way to define their own structured data with completely unique tags (e.g., an <ingredients> tag in a recipe, or an <author> tag in an article). Over the years, XML has become the backbone of widely used technologies, like RSS and MathML, as well as server-level APIs.
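A short sketch of that promise, reusing the recipe example above: every tag is invented for the data at hand, and XML asks only that the result be well-formed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Every tag below is made up for this data; no standards body defines them. -->
<recipe>
  <title>Basic Bread</title>
  <ingredients>
    <ingredient quantity="500" unit="g">flour</ingredient>
    <ingredient quantity="7" unit="g">yeast</ingredient>
  </ingredients>
</recipe>
```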
XML was appealing to the maintainers of HTML, a language that was beginning to feel somewhat complete. “When we published HTML 4, the group was then basically closed,” Steve Pemberton, chair of the HTML working group at the time, described the situation. “Six months later, though, when XML was up and running, people came up with the idea that maybe there should be an XML version of HTML.” The merging of HTML and XML became known as XHTML. Within a year, it was the W3C’s main focus.
The first iterations of XHTML, drafted in 1998, were not all that different from what already existed in the HTML specifications. The only real difference was that it had stricter rules for authors to follow. But that small constraint opened up new possibilities for the future, and XHTML was initially celebrated. The Web Standards Project issued a press release the day it was published, lauding its capabilities, and developers began to adopt the stricter markup rules it required, in line with the work Connolly had already done with document type definitions.
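Roughly, the change looked like this. The first snippet is the kind of loose markup browsers had tolerated for years; the second is the same content rewritten under XHTML’s rules, with lowercase tag names, quoted attributes, and every element explicitly closed.

```html
<!-- Loose HTML, tolerated by every browser of the era: -->
<P>A paragraph of text
<IMG SRC=photo.jpg>

<!-- The same content as well-formed XHTML 1.0: -->
<p>A paragraph of text</p>
<img src="photo.jpg" alt="A photo" />
```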
XHTML represented a web with deeper meaning. Data would be owned by the web’s creators. And together, computers and programmers could create a more connected and understandable web. That meaning was labeled semantics. The Semantic Web would become the W3C’s greatest ambition, and they would chase it for close to a decade.
Subsequent versions of XHTML would introduce even stricter rules, leaning harder into the structure of XML. Released in 2002, the XHTML 2.0 specification became a harbinger of the language’s undoing. It removed backwards compatibility with older versions of HTML, even as Microsoft’s Internet Explorer — the leading browser by a wide margin at this point — refused to support it. “XHTML 2 was a beautiful specification of philosophical purity that had absolutely no resemblance to the real world,” said Bruce Lawson, an HTML evangelist for Opera at the time.
Rather than uniting standards under a common banner, XHTML, and the refusal of major browsers to fully implement it, threatened to split the web apart permanently. It would take something bold to push web standards in a new direction. But that was still years away.