A Blog by Jonathan Low

 

Mar 28, 2019

How Google Warped Hyperlinks But the EU's New Copyright Law Might Change That

Hyperlinks originally had the modest goal of facilitating academic research. That was then.

Google perceived a much broader opportunity, the reality of which has changed many lives, eviscerated entire industries and professions, and upended centuries of communication and discourse.

The European Union has now voted to impose rules governing the hyperlinking of content, which, if they are upheld, may make the internet a bit slower and more expensive (while arguably safer and fairer) - or simply make big providers like Google and Facebook microscopically less profitable should they sensibly choose to absorb the added costs. JL


Sophie Charara reports in Wired and Kelsey Sutton reports in Adweek:

“Since Google came on the scene, links moved to being a battleground. Google’s core insight was that you could treat every link as a vote for the site.” Hyperlinks are doing more work than originally intended and the core of links has been altered. The (EU's new) rules are intended to give copyright holders/content creators more control over how their content is used and more ways to earn money from content. (It) will require web platforms and aggregation services to pay publishers when they aggregate portions of their content in search results. (It also) holds for-profit platforms accountable for copyright-infringing content that appears on their sites.
Wired: The fourth doctor, Tom Baker, stands on a scrap heap of CRTs, dressed in a tux and red cummerbund, addressing a bemused Douglas Adams. It’s the opening of the 1990 BBC documentary Hyperland and Baker, a software agent, is trying to coax Adams into going on a “ramble” through the internet.
Baker chaperones Adams as he jumps around various media, clicking hyperlinks to access everything from Robert Abel’s educational tool on Guernica to a live feed of the Atlantic Ocean. This intellectual ramble, which seems wonderfully naïve decades later, has its roots in proto-web information science concepts such as the “associative trails” of discovery through automated, cross-referenced documents that Vannevar Bush, head of the US Office of Scientific Research and Development and later a major influence on internet pioneers, outlined in 1945.
You might think of the hyperlink as a relatively recent invention but, at least conceptually, it’s not. More than 70 years later, though, it has been warped beyond all recognition from what was first proposed. In the 1960s Ted Nelson introduced the concepts of hypertext and hyperlinking between text and media, proclaiming “everything is deeply intertwingled” in 1974, and, to race through history, a succession of pioneering, local, collaborative systems including Doug Engelbart’s NLS, HyperTIES, Microcosm and Brown University’s Intermedia followed.
Tim Berners-Lee cited Ben Shneiderman and Dan Ostroff’s HyperTIES, the first electronic journal, as the inspiration behind the link concept for his WorldWideWeb proposal at CERN in 1989. The original impetus behind Berners-Lee’s work? To develop a system for automated information sharing across universities and research institutions.
“The early hypertext pioneers envisioned systems geared primarily for scholarly research and knowledge production; for the most part they did not envision the kind of commercial, consumer-oriented environment that has come to characterise the modern internet,” says Alex Wright, lecturer at the School of Visual Arts and author of Glut: Mastering Information Through The Ages. “In certain crucial respects the web is still a much more limited version of what Bush, Nelson, Engelbart, and others had in mind. Hyperlinks still only work in one direction; it’s all but impossible to follow user ‘trails’ among documents in the way that Bush envisioned; and there’s no single, universal mechanism for managing your identity across platforms.”
If the link is the connecting studs of a LEGO brick, in this sense it’s far from broken. Search engines aside, links are shared in WhatsApp groups, emails, on Twitter and Google Drive and you still browse the web in homage to the innocent, curious, proto and early web thinkers, every time you get lost down a rabbit hole. The web is global, not simply a local connected library. And yet Google, Apple, Facebook and Amazon have skewed the original ambitions for hyperlinks, who they are for and how far they can lead you.
The impact that Google’s PageRank algorithms have had on how the commercial web chooses to deploy hyperlinks can be seen in just about any SEO (search engine optimisation) blog. Publishers and businesses are encouraged to prioritise internal links over external links that may boost the competition in Google’s rankings.
“Since the very moment Google came on the scene, links moved from being the defining characteristic of the web, to being a battleground. Google’s core insight was that you could treat every link as, essentially, a vote for the site,” says Adam Tinworth, a digital publishing strategist. Tinworth explains that Google tries to minimise the effect of ‘unnatural linking patterns’, which include comment spam and ‘guest posts’, but that they remain part of “how the shadier side of the SEO industry operates”.
With clear financial incentives to serve Google’s web spiders, which regularly ‘crawl’ website content to determine its placement in searches, a common strategy involves placing hyperlinks on specific ‘anchor text’ - the actual words that you click on - that benefits the site’s PageRank for keywords rather than tailoring links to readers. That’s not inherently a problem but research from the University of Southampton, published in February, suggests it doesn’t go unnoticed.
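To make the "link as a vote" idea concrete, here is a minimal sketch of the simplified, published form of PageRank. It is an illustration only, not Google's production ranking. The page names, the `pagerank` function and the tiny `web` graph are hypothetical, and the damping value of 0.85 is simply the commonly cited default.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate rank for a small link graph.

    links maps each page to the list of pages it links to.
    Every outgoing link passes on an equal share of the linking
    page's current rank, which is the sense in which a link is a vote.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "shop", two link to "news".
web = {
    "blog": ["news", "shop"],
    "news": ["shop"],
    "forum": ["shop", "news"],
    "shop": [],
}
ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page that collects the most inbound links ends up ranked highest, which is why a link to a competitor can look like a cost to a publisher chasing search placement.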
Researchers used eye-tracking tech on 30 participants to find out how hyperlinks affect human readers’ experience of a web page. Confirming pre-web research on signalling theory, they found that people reading passages of text containing blue, underlined hyperlinks, or simply blue words, were more likely to re-read sentences when uncommon words were linked and therefore highlighted. (Berners-Lee doesn’t remember who decided on the standard blue, underlined hyperlinks though early browsers like Mosaic undoubtedly popularised them.)
“What does your brain do when you’re looking at a blue word and a bunch of black words?” says Gemma Fitzsimmons, a web science researcher who led the study. “The main thing is that when you have a blue or bold word on its own and it’s the only unique thing that stands out, everyone thinks, ‘I need to look at that, it might be important.’ The less hyperlinks you have, the more important they seem.” If hyperlinks were completely geared towards human readers of texts, they’d point towards relevant, contextual information using anchor text that contains the most important points on the page.
In the cases of Apple and Facebook, the question isn’t so much how we link and how we react to them, as where we can link to and where we can follow links to. Facebook’s Instant Articles, Google AMP (and indeed apps like Apple News) all propose variations on limited systems of linking back to sources of information. As for Instagram, it’s based on a two-tier system: users can’t add external links to posts (#linkinbio) unless they buy adverts whereas accounts with a large number of followers are able to add external links to Stories.
Facebook’s linking rules aren’t as explicitly limited but its news feed algorithm encourages people to pay to boost posts with external links that take users out of its network. As computer scientist and internet pioneer Dame Wendy Hall puts it, “Facebook would love us to just live inside of Facebook.” Truly surfing the entire web, hopping from link to link, hasn’t made business sense, it seems, hence the silos. If working today, Vannevar Bush, whose Memex concept was concerned with books and microfilm, might effectively consider Instagram as one book, Facebook as another book, with information inside in need of liberation via associative trails (links).
Hall describes changes in linking practices as a “side effect”, not a cause, of the fracturing of the internet based on the key issue of who controls data. So perhaps shifting trends in hyperlinking can at best be seen as symptoms of wider concerns around the web at 30.
“The whole phenomenon of fake news and information silos has been facilitated by platforms like Facebook taking control of content, even preventing outbound links by displaying content within the app, and feeding you content they think you want to see,” says Belinda Barnet, senior lecturer in media at Swinburne University of Technology and author of Memory Machines: The Evolution of Hypertext. “To free us from these ‘echo chambers’, we need to open those platforms up or at least make their workings more transparent. It’s not enough just to change the hyperlinks; we need to crack the whole thing open.”
In tackling what’s broken in online news and information, there’s a renewed focus on provenance. “The basic ethics of linking are simple,” says Adam Tinworth. “If you sourced information or content from elsewhere, link to that source. It’s what the web was built for, and it’s just good manners.” Things get “murky”, though, when media outlets are considering linking to problematic or dangerous content, in which case ‘no follow’ links can be used to signal to Google that it should ignore that site for ranking purposes.
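A ‘no follow’ link is an ordinary link whose `rel` attribute contains the token `nofollow`. As a rough sketch of how a crawler might separate the two kinds of link, the example below uses Python's standard `html.parser` module. The `LinkCollector` class and the example URLs are hypothetical, and this is not a description of how Google's own crawler is built.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Sort the links on a page into followed and nofollow lists."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel can hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


# Placeholder page: one ordinary source link, one link marked nofollow.
page = """
<p>Our report draws on <a href="https://example.org/study">this study</a>.
We do not endorse <a href="https://example.net/dubious" rel="nofollow">this site</a>.</p>
"""

collector = LinkCollector()
collector.feed(page)
print("followed:", collector.followed)
print("nofollow:", collector.nofollow)
```

The reader can still click both links. Only the signal passed to a ranking system differs.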
Returning to PageRank, in 2016 Google advised that social media influencers use no-follow links when including affiliate links on their posts, in which they receive a product sales commission in return for freebies, gifts and sponsorship. With Amazon’s international affiliate scheme, in particular, offering a revenue stream to social media creators and media organisations, money is now flowing directly through hyperlinks, tagged to the influencer or publisher, with the onus on the creator of the links to make that fact known to readers, viewers and subscribers. (WIRED is a member of affiliate schemes).
It’s unlikely that the hyperlink will get a makeover in the near future because a large part of the web’s success lies in its simplicity - attempts at backlinks, labels and hovering windows tend to feel cluttered. If hyperlinks were to be given an update, though, there’s no shortage of suggestions from thinkers and computer scientists who began working before the web.
Berners-Lee’s links were designed to be one-way, unlike the two-way hyperlinks first suggested by Ted Nelson. In Nelson’s ongoing Project Xanadu, “links must be bivisible and bifollowable,” says Barnet, “capable of being seen and followed from the destination document as well as the originating document.” And Nelson went further, with a method of citing text or media by integrating parts of the original into the first ‘window’ or ‘document’, something he refers to as a “parallel presentation” in Werner Herzog’s documentary Lo and Behold: Reveries of the Connected World. These were hyperlinked so that the user could click through to get the full context with a mechanism for micropayments to the original author.
“The system we were working on at Southampton, Microcosm [the pre-web hypermedia system developed in the 1980s], had very sophisticated two-way linking,” says Dame Wendy Hall, professor of computer science at the University of Southampton. “It was very prescient of the Semantic Web – you used the links to describe why you were making that relationship between those two data objects.”
The Semantic Web, first proposed by Berners-Lee in 1994, moves away from 20th-century ideas of interlinked documents and pages towards a web of data that can be processed by machines. Barnet similarly suggests a “more intelligent linking system”, something she says the Semantic Web is seeking to provide: “Is there a more efficient way of linking that could identify content by what it is, not just where it is?”
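One way to picture two-way links that also record why they were made is to keep the links in a separate store instead of embedding them in the documents. The sketch below is only an illustration of that idea. The `Link` and `LinkBase` names, the file names and the relation labels are all hypothetical, and it does not reproduce the actual design of Microcosm or Xanadu.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    relation: str  # why the two documents are being related


class LinkBase:
    """Hold links outside the documents so they can be followed both ways."""

    def __init__(self):
        self._outgoing = defaultdict(list)
        self._incoming = defaultdict(list)

    def add(self, source, target, relation):
        link = Link(source, target, relation)
        # Index the same link under both ends.
        self._outgoing[source].append(link)
        self._incoming[target].append(link)
        return link

    def links_from(self, document):
        return list(self._outgoing.get(document, []))

    def links_to(self, document):
        return list(self._incoming.get(document, []))


base = LinkBase()
base.add("essay.txt", "memex-1945.txt", "cites")
base.add("rebuttal.txt", "essay.txt", "disagrees with")

# Starting from the essay, a reader can see what it cites and what responds to it.
for link in base.links_from("essay.txt"):
    print(f"essay.txt {link.relation} {link.target}")
for link in base.links_to("essay.txt"):
    print(f"{link.source} {link.relation} essay.txt")
```

A standard web link supports only the first of those two lookups, because the link lives inside the source page and the destination has no record of it.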
Then there’s the question of whether the many hidden functions of hyperlinks could be made more transparent. Designer Ted Hunt, a resident at Somerset House Studios, suggests an alternate timeline in which Nelson’s nonlinear links influence everything from the authority of information online to copyright. In 2016, Hunt took Nelson’s concepts and combined them with the +/- classifications of Paul Otlet, an earlier influential thinker on information science, working in the 1930s, about more sophisticated relationships between information.
In a one-day project, he created a speculative system of classifying links. A double underline indicates a citation of a source document and dash, dot and wave underlines signal agreement, disagreement and other relationships. “Otlet proposed ideas about how information has its own social world,” Hunt says. “You could relate documents that disagreed with each other, or cited each other or built on each other and XYZ.”
It’s an interesting experiment into the user interface of hyperlinks but, says Hall, this kind of functionality can’t be retrofitted into what we have now. “If you’ve got a way of building links between data objects, as [Berners-Lee] proposed in the Semantic Web, or the way we were doing it, then of course you build in about whether you like or dislike something. But you can’t build it into the vanilla web because they’re static links embedded in documents.” Alex Wright sees linked data movements as “in some ways harkening back to Paul Otlet’s vision of a more organised, structured networked information environment.”
Hyperlinks are doing a lot more work than was originally intended and at the same time, the core component of links between ‘pages’ and ‘documents’, not just within them, has been altered. Barnet points out that as wholesome and pure as early, small-scale visions seem now, “you can’t just meander about and find what you need, any more than you can wander around the Library of Congress and arrive at the exact document you need.”
Adweek: After the European Parliament approved the EU Copyright Directive in a 348-274 vote, the controversial and complicated overhaul of copyright rules in the European Union will cause major ripple effects across the web, if not change the internet for good. The new law is intended to update existing European copyright rules for the internet age, but tech companies and free speech advocates have voiced concern that the directive will limit the way content is moderated and disseminated online, chilling free speech, making digital content harder to find and disenfranchising smaller businesses who can’t afford to comply with the rules.
Advocates of the directive, though, argue the requirements will give publishers and content creators more control over their work and allow them to reap the financial rewards of their content when it’s disseminated across the internet.
Ongoing discussion and protests around the bill’s passage have been contentious, and a Change.org petition opposing the directive has collected more than 5 million signatures as of today.
Policy experts expect the rules will have consequences around the world.
“The EU is, for better or for worse, a global standard-setter on internet policy,” said Raegan MacDonald, the head of EU public policy for the open-source software company Mozilla, which opposes the new rules. “ … I don’t think it will be long until it comes to the U.S., and the way the EU and the U.S. have been looking at the tech-lash and examining the responsibility of platforms, the discussions are actually quite similar.”

What the directive is

The EU Copyright Directive is a series of rules aimed at updating copyright regulation to account for the digital age. In broad terms, the rules are intended to give copyright holders, like publishers and content creators, more control over how their content is used across the internet and provide more ways to earn money from their content. It’ll also expand the ways some content can be used for educational and cultural purposes. Two portions of the directive in particular, Article 11 and Article 17 (previously numbered Article 13), have drawn the most attention and scrutiny. Article 11 will require web platforms and news aggregation services to pay publishers when they aggregate small portions of their content in search results, a policy that’s been referred to as a “link tax.”
The other, Article 17, holds for-profit platforms accountable for copyright-infringing content that appears on their sites. In practice, it will likely mean companies would have to implement a screening process before content can live on their platforms, much like YouTube’s Content ID upload filter, which checks for copyrighted music and other assets prior to upload.
Legislators supportive of the articles say that the law will make the internet fairer by ensuring that content creators are compensated when their content is used and aggregated, and by requiring tech platforms to ensure that content they profit from is not infringing copyright. People opposing the directive are much less optimistic, saying the bill will limit the spread of information, chill free speech and prevent small platforms from competing with the established tech giants.

What publishers think

Media associations and groups that represent big publishers operating in the EU have expressed support for the finalized text, and are particularly enthusiastic about Article 11, which they say will allow them to get more money from search results. More than 270 organizations comprising groups representing news publishers, photographers and music publishers advocated for the rules, and in a joint statement called the bill a “historical opportunity” to help build “an internet that is fair and sustainable for all.”
Not every publisher is on board. A study prepared for Parliament, which concluded that Article 11 would not help fund the media industry as intended, found that many journalists opposed the proposal.
Allison Davenport, a technology law and policy fellow at the Wikimedia Foundation, said that for platforms like Wikipedia, the link tax will affect contributors’ abilities to find information for Wikipedia and will ultimately shrink “the depth, accuracy and quality of Wikipedia’s content.” The Asturian, Catalan, Galician and Italian versions of Wikipedia blacked out the sites today in protest of the directive.
Similar link taxes have been shown to have negative effects on publishers elsewhere. In Spain, for instance, similar legislation requiring Google to pay all publishers for snippets included in its news aggregation service prompted Google News to pull out of the country entirely, leading to a dramatic decline in traffic to Spanish newspapers.
After a similar bill passed in Germany, Google simply refused to pay publishers to include their content and waited for publishers to waive the legally-imposed fees. The German publisher Axel Springer tried to hold its own against Google, but after suffering a 40-percent drop in traffic, caved, and let Google include snippets of its news content without paying.
It’s unclear what exactly the rules might mean for U.S. publishers, who last year told Adweek they were uncertain whether the directive would hurt or help publishers not bound by it.

What platforms think

Big tech platforms have come out against the bill. In a company blog post published in early March, Google svp of global affairs Kent Walker said the directive “creates vague, untested requirements, which are likely to result in online services over-blocking content to limit legal risk.” Google has also expressed opposition to Article 11, which would require web platforms to pay publishers for aggregating portions of their content, including photos and snippets of articles. In a statement today, a spokesperson for Google said the law will be bad for businesses.
“The Copyright Directive is improved but will still lead to legal uncertainty and will hurt Europe’s creative and digital economies,” the spokesperson said. “The details matter, and we look forward to working with policy makers, publishers, creators and rights holders as EU member states move to implement these new rules.”
Facebook, which has previously voiced opposition to parts of the directive, did not immediately respond to a request for comment.
Smaller platforms that might not be able to shoulder the cost of creating a content-filter or afford the legal risks of hosting content could face even tougher barriers to entry as a result of the bill, MacDonald said in an interview with Adweek prior to the directive’s final approval. She said she feared that the rules will further entrench the power of the big internet platforms because those companies will have the resources to manage it.
“If you are not YouTube or Facebook, which has its own tailored tech, and if you are not one of these platforms with massive resources, you might just not take the risk,” MacDonald said. “We are concerned that we will see much fewer open platforms because of this.”
The Electronic Frontier Foundation, along with other digital rights groups and free speech advocates, has warned that platforms’ content-filtering processes could chill free speech and creativity online, especially when those processes often happen automatically and without human review.
“Machines are good at many things—making the final determination on your rights isn’t one of them,” EFF activism director Elliot Harmon wrote in a February blog post about why the group opposes upload filters that the copyright directive would ultimately require.

What happens next

It’s likely that the final text of the bill will be approved by legislators later this month. Barring even more pressure on legislators and a reversal of an EU country’s stance on the bill, the directive will get a final rubber-stamp, and countries in the EU will have until 2021 to implement laws of their own that match the directive. What that will mean in practice, though, is unclear—primarily because the text of the directive leaves lots of room for legal interpretation.
“I think the legal uncertainty that will be created is really overwhelming for even large companies,” MacDonald said.
It’s not just the internet in the EU that stands to change dramatically once the bill is implemented. If the industry learned anything from the passage of the EU’s General Data Protection Regulation, it’s that what happens in the EU doesn’t stay in the EU. Free speech advocates, small publishers and platforms say they will continue to ring the alarm bell hoping to stop other legislative bodies from adopting similar rules.
“The EU has squandered the opportunity of a generation,” MacDonald said, “… and the biggest loser out of this whole reform will be individual users.”
