State of the Browser

Web Standards:
Does Anyone Actually Care?

When looking at the current state of the web, and especially under the hood of interfaces and functionality, we may notice that the vast majority of websites no longer validate against W3C standards. How did that happen, and with what consequences for SEO, accessibility, and security?

In this presentation, Michael R. Lorek will share insights and realisations on this "vital" topic by shining light on the most common errors in foundational HTML and its semantic challenges.

We will notice that HTML is still an evolving markup language, setting the foundation for all our web pages and web applications that truly need to be solid.

Transcript

[MUSIC PLAYING]

Yes, good morning.

And it's great to see everyone here.

I feel very, very honored to be given the opportunity

to share some thoughts, insights, and realizations

on the state of the web and the state of web design practice,

which I will go into.

And later we will dig a bit deeper into its foundation, which is HTML.

But first, it might be good to get an idea of who is in the room. So if you are a WordPress

developer, please give a show of hands. Okay. Any what-you-see-is-what-you-get tools, Wix and...?

Nobody. This is also nice.

What else do we have?

We have enterprise technologies.

Anybody doing that?

Wow.

I'm impressed.

Good.

Design systems.

Does anyone work with design systems?

Right.

Hand coders, hand crafters, lovely.

And does anybody here already let the job be done by AI?

(audience laughing)

Whoa. (laughs)

This is great.

So last question.

Who of you actually validates your webpages?

Interesting.

Good.

All right, so let's dive into it.

I have to press this button, you know.

Okay, sorry.

All right.

So we came a long way.

It all started, actually, in 1989,

when Tim Berners-Lee had the idea

to build something, to set the building blocks for the web. Well, he actually released HTML

in 1991.

And then there were two websites, really, and a bit later there were 3,000.

I came in by 1996, I believe. I had a technical background, I did

programming in the 80s, and when I saw HTML I thought, this is fantastic for doing your documentation,

and so I went into that and started learning it.

Then we went on, and a few years later, it was before 2000, I had the opportunity to

teach it, even to 30 to 60 year olds.

Actually, I added that teaching element for them, because the course was meant to show how the browser

works, but I thought they might understand better if they also saw what works on the other side,

what works under the hood.

The interesting thing was really that they all knew only word processors and spreadsheets.

Nobody had a background in programming, nothing.

And the most fantastic thing that really struck me was that they started using the alt attribute

in the image.

Why? To tell stories.

So they put the images up, and you hover a little bit with the mouse, and they start telling

stories.

So we said, "Oh, you don't use our tech.

That's quite interesting."

And then everybody was messing around, and there were no rules for building web pages

at that time.

Until, in 2000, Jakob Nielsen came along and published on usability.

So then we started thinking, "Oh, what is really going on?"

We built on that, actually, and then a bit later Jeffrey Zeldman came in with "Designing with

Web Standards", but before that we built our pages in tables.

So everything changed.

CSS came in.

And then it became very, very tricky when you wanted to do the layout. It was actually

more complicated to build a layout with CSS: a two-column layout gave you a

headache, and a three-column layout drove you mental.

And a big relief came a bit later when CSS Grid came out, which made everything nicer.

And also Molly Holzschlag did a big thing on CSS design

at the time with a publication on CSS Zen Garden.

So looking up here, we had 10 websites here, then 2 million.

Now we are at 1.13 billion.

No.

80%, 82% is inactive.

So we have only 200 million websites out there.

It was quite interesting to know.

But in all of that, there is something we didn't come across

when we built our websites.

What is under the hood?

Well, we just lived in code.

So this was already released in 2006.

It shows the different layers of a web page,

so what's behind it.

So it all starts with a strategy.

You have a scope.

You build a structure, information architecture,

all of that.

A skeleton, the surface, that's what we see on the display.

And most people-- and you see nowadays businesses,

when they think about websites--

they see only the surface.

It's like the old town when you have the Western movie

and you have the facade and what is behind that.

That was the understanding, and I think this comes across really well.

I came quite late to see it.

It was just a few years ago.

But the understanding of UX that came with it, that was 2006, yes.

So web standards in the old days were not really relevant.

We didn't know before 2000 that there were web standards.

But with this, when we see what we can do,

web standards became standards.

So I think the new definition by the W3C is really nice.

Web standards are the blueprints or building blocks

of a consistent and harmonious digitally connected world.

They are implemented in browsers, blogs, search engines,

and other software that power our experience on the web.

It's interconnected, interoperable.

And it is important. Without standards, it's the same as when you

mix imperial with metric.

So we need standards.

This is important.

So what happened then?

We are here now.

So seriously, do we have standards?

And who actually cares about the standards?

And it was last year when I came across a lady, she was a businesswoman, and she ran

three websites.

She approached me and asked if I could help her with her websites.

I said, "What's up with them?"

"Yeah, I have a graphic designer and she built everything in Wix.

They don't validate."

I said, "Okay."

And it rang a bell with me: "Oh, validation, yeah."

Well, I've been coding all the time, and validation wasn't really on my mind

when you understand how to code, the syntax and the quality.

So I looked into that and I said, "Okay, Wix is doing that, interesting."

And then I came across, a few months ago, a company who...

Yeah, they actually help businesses to optimize their websites completely.

They monitor everything, they do everything, so you would believe their own website is good.

So practice what you preach.

And I go quickly out here and...

Can I do that?

No, I can't.

Ouch.

I can throw it onto the other screen, I believe.

Sorry for that.

It must work, does it?

Yay! It works.

I can't go through that.

But this is the list of...

Ouch, I did a mistake now. I'm very sorry.

Yeah, but we can leave it.

No, I just go back. Sorry for that.

Sorry for that.

Back. How do I go back to this now?

Okay, go back to here. Okay, so sorry for that.

So then I was looking it up earlier this year.

In 2022, of the global top 100 websites,

how many use valid HTML?

Yeah?

How many, you believe?

10?

1, 0, 0.

None.

[LAUGHTER]

So this is HTML.

2021: 98% of the top US websites had invalid HTML.

We had CSS for a bit.

Only three websites, 2021, 2022, it improved.

So CSS is on the rise.

And-- oh, step to the back.

Oh, back to back to back.

I need some OTA now.

And back to this.

Sorry.

I wanted to get the other slide here.

I just skip it.

It was just a list.

Now you see it here now.

Sam Wang nicely did the work and put all of those into a table, and you can see,

and I will share this document later,

where they fail, in CSS, HTML, or elsewhere. It's a nice way to do that.

All the top 100 are in there.

And then was the other document here.

Where was it?

This is the thing you can't-- ah, here it is.

No.

I can't go there.

Oh, come on.

I'm very sorry for this.

Doo doo doo doo.

Come on back.

There, that's the one.

So, looking into the validation of the other site,

which I cannot access right now:

I ran this company's pages through the validator, and it listed all their

errors. I couldn't believe it, when they should practice what they preach:

380 lines of warnings and errors. So how can you perform with something like this?

It makes me really wonder, how can you do that?

Whether it is one issue or 50 on a website,

it cannot be half valid.

So in other words, the web is broken.

And apparently there's a global inability

to produce valid HTML output.

How could all of that happen and why?

So let's look at the causes.

We have what-you-see-is-what-you-get tools,

often recommended as free for small and medium enterprises.

And in this country, around 99.9% of businesses are SMEs.

And they've been given the impression:

just go online, build your site,

you get tools like Wix, like Site Builder,

you have Squarespace and all that lot.

And nobody has been told about usability and GDPR compliance.

They think about colors and messing around with...

Then we have WordPress, 810 million websites now, 43%, and it is increasing, going up.

But then the themes rarely validate.

And even when you look up themes in the database and it tells you, "Yeah, we validate,"

you just check what they present, and it doesn't validate.

So why don't they do that?

Then we have third-party snippets or integration

services that often also provide you

with questionable code quality.

Design systems mostly run on React or Vue.js

instead of being architected in CSS.

So what they do with React: they

build the whole website, or the web page,

as one single page in the back end

and ship it to the browser.

So then in front of my mom's mansion, yeah,

but they run through one and a half million lines of code

and you get an empty page.

So, and the tricky thing now

is the frameworks, and we should really imagine

what would happen if we didn't have the frameworks.

Well, they keep it together.

Yeah, when you start doing this whole framework, ooh.

And then we have enterprise technologies.

No-code, SaaS, and so on.

It gets a bit out of hand, and it gets a bit complex

diving into these technologies.

And AI: we still don't have it fixing our flawed code,

and I think it would be quite amazing

if that came one day.

So one thing I found really, really important,

and probably the most important,

is inadequate teaching, online or in print.

And I've seen so much teaching,

and free boot camps: yeah, you know, you're a programmer,

you're a software engineer,

after three months in a boot camp? I'm sorry.

But the outreach of teaching,

when you go online, can be thousands.

And I've seen teaching and I come back to that in a bit.

So, what's been taught?

So why do we need to validate?

What are its benefits?

So search engine optimization is certainly

affected, accessibility, and security.

And that's what nobody really thinks about.

So let's dig into that for a sec.

So search crawlers have difficulty

discovering your content when the content is a mess.

There is code surrounding it.

So it needs to be correctly structured semantically.

Divs: probably best to replace them wherever you can,

especially with semantic elements.

And for businesses, well, their sites fail.

And everybody wants search engine optimization, all of that.

So why don't you fix your code first

before you invest lots of money and effort

into search engine optimization,

into the architecture of your content.

Then we have accessibility,

which is affected by missing or incorrect doctypes,

missing character encodings, unsupported tags or attributes,

improperly formatted HTML, improper tables,

and improper use of forms.
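
Two of the items in that list, a missing doctype and a missing character encoding, are easy to check mechanically. The following is only a minimal sketch, assuming Python's standard `html.parser`; the names `BasicsAudit` and `audit_basics` are made up for illustration, and this is no substitute for running a real validator.

```python
from html.parser import HTMLParser


class BasicsAudit(HTMLParser):
    """Records whether a page declares a doctype and a character encoding."""

    def __init__(self):
        super().__init__()
        self.has_doctype = False
        self.has_charset = False

    def handle_decl(self, decl):
        # For "<!DOCTYPE html>" the parser passes the text "DOCTYPE html".
        if decl.strip().lower().startswith("doctype html"):
            self.has_doctype = True

    def handle_starttag(self, tag, attrs):
        # Only the short form <meta charset="..."> is recognised in this sketch.
        if tag == "meta" and "charset" in dict(attrs):
            self.has_charset = True


def audit_basics(markup: str) -> list[str]:
    """Return a list of human-readable problems (hypothetical helper)."""
    audit = BasicsAudit()
    audit.feed(markup)
    audit.close()
    issues = []
    if not audit.has_doctype:
        issues.append("missing doctype")
    if not audit.has_charset:
        issues.append("missing character encoding")
    return issues
```

Calling `audit_basics` on a page's source returns an empty list when both declarations are present. The other items in the list (tables, forms, unsupported attributes) need a full validator.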

Security, ooh.

This one gets really nuts and hairy.

So there are doctype declarations

that are missing.

Escape characters.

Careful.

And you see it often when you even validate--

it's mostly JavaScript related.

What pops up is that they give you the quotes.

There's nothing in.

You miss a quote.

Bad.

Very bad.
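
To illustrate why a stray quote matters, here is a minimal sketch, assuming a server-side template written in Python. The function name `render_search_field` and the form field are hypothetical. The idea is that any value placed inside a quoted attribute is escaped first, so a quote in the input cannot close the attribute early.

```python
from html import escape


def render_search_field(user_value: str) -> str:
    """Build an <input> whose value attribute cannot be broken out of.

    escape(..., quote=True) replaces &, <, > and both quote characters
    with character references.
    """
    safe_value = escape(user_value, quote=True)
    return f'<input type="search" name="q" value="{safe_value}">'
```

With the escaping in place, an input such as `"><script>` stays inside the attribute as harmless text instead of becoming markup.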

So it gets a bit complicated.

And what I will do, I will add more references

to that in the documents we've got online related

with the talks.

There's one thing, really, with HTTP:

don't load HTTP resources into HTTPS pages.

It's just, it's a great opportunity

for people to play with your stuff.
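
A rough way to spot such mixed content before shipping is to scan the markup for subresources that are still requested over plain HTTP. This is a sketch under assumptions: it uses Python's standard `html.parser`, it checks only a handful of elements, and the names `MixedContentFinder` and `find_mixed_content` are invented for this example.

```python
from html.parser import HTMLParser

# Elements whose src attribute makes the browser fetch a subresource.
# Ordinary <a href> links are navigation, not subresources, so they are skipped.
SRC_ELEMENTS = {"img", "script", "iframe", "audio", "video", "source"}


class MixedContentFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        url = None
        if tag in SRC_ELEMENTS:
            url = attr_map.get("src")
        elif tag == "link" and "stylesheet" in (attr_map.get("rel") or "").lower().split():
            url = attr_map.get("href")
        if url and url.lower().startswith("http://"):
            self.insecure.append((tag, url))


def find_mixed_content(markup: str):
    """Return (tag, url) pairs for subresources loaded over plain HTTP."""
    finder = MixedContentFinder()
    finder.feed(markup)
    finder.close()
    return finder.insecure
```

Anything this returns for a page served over HTTPS is a candidate to move to an `https://` URL.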

So, and, you know, some references here,

but the other thing is,

just pen test your site before you ship it.

You've got a form or something in there,

pen test it and fix it.

People don't think about those things, but it makes it safe, secure for everyone.

So coming back to the teaching.

And I quite often meet people who teach, and sometimes I just look: how do they teach

the basics of HTML?

HTML is the foundation we build on.

So the browser is very, very tolerant.

They say your syntax doesn't really matter, self-closing, markup:

The browser does it.

You don't have to think too much.

But then you think about the browser, the performance.

But what does it do there under the hood?

No, consistency is key, absolutely,

when you write your code.

Giving you some examples where I came across problems: the character set.

And I found it even in books.

They don't tell you that it needs to be set within the first 1,024 bytes, yeah.

So it should be prioritized at the top.

On the top, you get the English, yeah, oh, stop, stop, go back.

What was this now?

Okay, just get out of that, okay.

So the language comes first

and then you set the character set before the title.

The title can be longer.

You can have more stuff in there.

And then I've seen, oh, you put the character set,

the definition there.

You can't do that.

Put it on top.

It causes trouble.
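
The 1,024-byte rule can be checked on the raw bytes of a page. The sketch below is an assumption-laden illustration, not the browser's actual encoding-sniffing algorithm. It uses a simple regular expression, and the helper names `charset_position` and `charset_declared_early` are hypothetical.

```python
import re

# Matches a <meta> tag that carries a charset, in either the short form
# <meta charset="utf-8"> or the older http-equiv/content form.
META_CHARSET = re.compile(
    rb"<meta\b[^>]*?charset\s*=\s*[\"']?\s*([A-Za-z0-9._:-]+)[^>]*>",
    re.IGNORECASE,
)


def charset_position(document: bytes):
    """Return (charset name, byte offset just past the meta tag), or None."""
    match = META_CHARSET.search(document)
    if match is None:
        return None
    return match.group(1).decode("ascii"), match.end()


def charset_declared_early(document: bytes, limit: int = 1024) -> bool:
    """True if the whole charset <meta> tag sits within the first `limit` bytes."""
    found = charset_position(document)
    return found is not None and found[1] <= limit
```

A long title or a block of scripts placed before the declaration is what typically pushes it past the limit, which is why it belongs at the very top of the head.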

Another thing.

Image.

Images.

So often we say, yeah, focus on the alt attribute.

Yeah, it's OK.

Accessibility is very important.

But the browser, if it's not getting the image size,

what does it do?

It tries to calculate it, and you get all these flickering images in a browser, and

it shifts the layout around and all of that.

And the browser doesn't need to calculate, just add the pixels, the height, the width.
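
Missing dimensions are simple to find automatically. The following is a minimal sketch, assuming Python's standard `html.parser`; the names `ImageSizeAudit` and `images_missing_attributes` are invented here. It reports every image that lacks `width`, `height`, or `alt`.

```python
from html.parser import HTMLParser

REQUIRED = ("width", "height", "alt")


class ImageSizeAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # An empty alt="" is present and therefore not reported here.
        absent = [name for name in REQUIRED if name not in attr_map]
        if absent:
            self.missing.append((attr_map.get("src"), absent))


def images_missing_attributes(markup: str):
    """Return (src, [missing attribute names]) for each incomplete <img>."""
    audit = ImageSizeAudit()
    audit.feed(markup)
    audit.close()
    return audit.missing
```

With `width` and `height` present, the browser can reserve the space before the image arrives, which avoids the layout shifting described above.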

And then there's another thing that, as I've often seen, nobody really uses: it's the

title.

It gives you a search engine optimization enhancement.

You get more visibility on your page if you do it right.

So inconsistencies and contradictions

are things I came across within the HTML spec.

So now I ask you, what is right, what is wrong?

Anybody any idea?

What is an acronym?

This is where it gets really bad.

The W3C, it was Microsoft in the old days.

They said, oh yeah, we add the acronym to abbreviation.

Then the W3C, they said, oh, it's

too difficult for developers to distinguish between acronym

and abbreviation.

So they put it as abbreviation.

It's always there.

They removed the acronym.

If you talk nowadays to content strategists,

if you talk to journalists, writers, authors, they know, and they're not happy

with that, that there's only abbreviation.

The thing is...

I'll...

Go back.

Hello.

[laughter]

Sorry for that.

It happens when you use these things.

Okay, sorry.

We're all friendly here, hopefully.

So, I have to press this button.

[laughter]

Yeah, you see?

Hypertext is one word, not two.

So it's an abbreviation.

An acronym is where everything is that.

It's separate.

So CSS, they both are correct.

But still the acronym is still not part of the spec and it should be, it should be.

Well, you think about people nowadays that work a lot with content and it should be there.

And I'm not feeling happy.

I don't know about you.

Anybody deals with content?

So something should be done actually about that.

So both are correct.

Okay.

There's another thing that gives me a bit of creeps.

When we recently had Corona, everybody went online.

What happened then is you have webinars, you have meetups, you start,

"There's no meetup in Paris, so everybody is online."

And you find a Philadelphia meetup with talks you're interested in.

It's across the globe.

What happened now is we have the spec for date and time.

And I notice I have been missing quite a few webinars because

of the time zone.

What they started doing: you get here...

I'll do the right button.

You get the UTC here. That's cool. It's good having that. You can calculate from that.

For a normal person: oh, it's that time zone. You look up on the website, when is the time

in my zone? So we go from PST, Pacific Standard Time, or PT, Pacific Time,

to BST or GMT.

Cool.

Now what happens is that in the States, that one changes at a different time than

BST and GMT change here.

So sometimes you miss out by an hour.

And then we have, as an example, you know, America/Los_Angeles.

It's probably also very difficult for a user or for a web developer to get that right,

to get it in.

What do I have to type into that?

America/Los_Angeles.

You have to look it up instead of really using PST.

So it would probably be very desirable to have PST, GMT, and the others around the globe

also in that standard.

It would make things easier and let the browser calculate that.
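
The missing hour can be reproduced with a small calculation. This is a sketch, assuming Python 3.9 or later with its standard `zoneinfo` module and a time zone database available on the system. The helper names are made up, and the dates are chosen from 2024, when clocks changed on 10 March in the US and on 31 March in the UK.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

LOS_ANGELES = ZoneInfo("America/Los_Angeles")
LONDON = ZoneInfo("Europe/London")


def london_time_for(la_local: datetime) -> datetime:
    """Convert a naive Los Angeles wall-clock time to London time."""
    return la_local.replace(tzinfo=LOS_ANGELES).astimezone(LONDON)


def hours_london_is_ahead(la_local: datetime) -> float:
    """How many hours London is ahead of Los Angeles at that moment."""
    aware = la_local.replace(tzinfo=LOS_ANGELES)
    london = aware.astimezone(LONDON)
    return (london.utcoffset() - aware.utcoffset()).total_seconds() / 3600


# In winter and in summer the gap is 8 hours, but in the weeks between
# the two clock changes it shrinks to 7.
winter_gap = hours_london_is_ahead(datetime(2024, 1, 15, 10, 0))
between_gap = hours_london_is_ahead(datetime(2024, 3, 20, 10, 0))
summer_gap = hours_london_is_ahead(datetime(2024, 7, 15, 10, 0))
```

A webinar announced for 10:00 Pacific on 20 March 2024 therefore starts at 17:00 in London, not at 18:00 as the usual eight-hour rule of thumb would suggest.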

OK, moving on.

languages, so often we see...

And I like to look at those pages, these websites, restaurant websites, they're really cool.

Has anybody experienced looking at a menu on a restaurant site?

Mostly you end up straight away with a PDF, and never

mind mobile: even on the desktop you have difficulty seeing what

they serve, what's on the menu.

And don't expect, at that moment when you look at an Italian restaurant, that they really

give you the nice thing in Italian or in French.

Do they mark it up as a second language?

But it's what you need to do.

Otherwise the second language ends up as garbage, because it is treated as English.

And then I once approached it. I've been working on adding a second language within

a sentence, within a paragraph.

It took me quite a while to figure that out.

And ah, no, there we go, come on.

Yeah, OK, sorry, wrong button.

So we can start a paragraph.

We have here English.

English is fine. Then we come here and we have a dfn, a definition. Well, it's cool, I

had to abuse it. There is no TransL tag. So I set the language to Sanskrit, and

that's what I typed in. So you know, you've got special characters there, and

I can quickly show you the example, how it looks, when I get over here again.

Can I do that? Yeah, alright, there is the example.

No?

It's not displaying that. Sorry.

This drives me nuts here. Okay.

I have to explain it then.

So you hover it with a mouse.

You hover it, and you get this text comes out.

But the tricky thing is-- and I had a few conversations

with Léonie Watson about that.

It's about a screen reader.

What will the screen reader do at that moment?

And it goes straight at that moment in here

into the other language and takes that as Sanskrit,

as an example.

You hover it with a mouse, with a cursor,

and it gives you this text.

Well, it's good.

I came across some organizations,

and they have thousands of pages where

they use second-language words within the paragraphs.

And they all-- they don't do it.

It's all English.

So all the effort they make, when they want to get it right,

it fails.

It's just seen as English, and it's just--

garbage comes out.

And that's what we don't want.
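
To make the effect of the `lang` attribute visible, here is a small sketch that lists which language each run of text would inherit. It assumes Python's standard `html.parser`; the class name `LanguageRuns`, the helper `language_runs`, and the sample sentence are all hypothetical. Real screen readers apply their own logic on top of this.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not go on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class LanguageRuns(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []   # (tag, effective language) for each open element
        self.runs = []    # (language, text) in document order

    def current_lang(self):
        return self.stack[-1][1] if self.stack else None

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        # An element without its own lang inherits the parent's language.
        lang = dict(attrs).get("lang") or self.current_lang()
        self.stack.append((tag, lang))

    def handle_endtag(self, tag):
        # Close the nearest matching open element and anything inside it.
        for index in range(len(self.stack) - 1, -1, -1):
            if self.stack[index][0] == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((self.current_lang(), text))


def language_runs(markup: str):
    parser = LanguageRuns()
    parser.feed(markup)
    parser.close()
    return parser.runs
```

For a paragraph marked `lang="en"` that wraps an Italian phrase in `<span lang="it">`, the result shows three runs: English, Italian, English. Without the inner attribute, all three come back as English, which is the situation described above.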

OK, so code quality.

Code quality is so important.

It's imperative that we get it right

and ensure all your code is clean and also

nicely readable to others.

So when you often look into your code

and you see generated code, it's a mess.

Can you read it?

No.

So also become aware of your code's back-end performance.

What's at the back end, especially when using React?

And ensure the accessibility of your code.

Consider your code's performance in the browser.

And this is a thing if a browser has to calculate it.

As an example, the browser is very tolerant of missing closing tags,

but it still needs to calculate

what it runs through.

So it comes down to sustainability.

And that's all I can say about that.

It really adds up.

We need to become aware that everything we do in code

gets translated many times until it reaches the core.

If you compile code, it goes into assembler

and it goes down into machine code, into microcode.

It gets translated many times,

so we need to be aware of that,

that it's not straight there.

So what is quite fantastic right now

is that the W3C came out with sustainability guidelines,

and part of that is: write clean code, please.

Right, key takeaways.

Validate all your web pages to the W3C standard.

I think this is-- yeah.

And fix it.

Fix it.

Please fix it.

Warnings, errors: you created them.

You are responsible.

Pen test your website before shipping.

Become aware of the security rules, yeah.

So educate your peers and website owners, especially on web standards.

I think this is so important.

So is it a professional-looking website? Validation will tell you.

Thank you.

[applause]

You get the contact details here.

I hope it was...

Thank you.

Good.

[applause]

[music]

About Michael R. Lorek

Michael is an online design consultant based in London, UK. With a background in engineering, design, and information technologies, he founded Online Design Ltd. in 2012 and helped various organisations architect and design their online presence by identifying the most appropriate technologies and workflows, resulting in effective and delightful user experiences.

Besides lecturing on internet technologies in further education, Michael also provided regular tech speaker workshops at CodeNode London, supported by Mozilla. He began programming in HEX 40 years ago and, over his career, became very passionate about code quality and standardized technologies, leading towards a healthy ethos for the web.