Chromium’s multi-process architecture


>>Fisher: So this is Chrome
Multi-Process Architecture, seat of the pants edition, and, you know, I just wanna take some time and give everybody sort of
the lay of the land, try to explain,
paint the broad picture of how everything works, and talk about
some of the things that we had to do in order to make it all work well. And then I wanna
try to leave some time– enough time for people
to ask questions, and hopefully there will be
some interesting questions, and we can dive into
some interesting things. So some of this will be review if you’ve read
some of the design docs, but I’ll just
get right into it. So first of all, what are some of the goals of Multi-Process Architecture? Well,
we wanted to make sure that– What we were basically
going after is, we really wanted to make
a very stable browser, fast and secure, and we recognized that one really great way to improve on all these fronts was to divide up the browser into multiple processes. So when I talk about
stability, though, there’s more than just
dividing the browser up into multiple processes in case the renderer crashes. We also wanted to make sure
that the renderer can’t do anything to make your browser hang or lockup in interesting ways, and we also wanted to make sure that essentially
the main process that was actually driving
the whole user interface would always be able to show some recent representation
of the web page such that even if the renderer
was out to lunch or running garbage collection or doing some crazy thing, we’d be able to show something meaningful
to the users. And this was sort of
a guiding principle for a lot of things, which I’ll get into later. We wanted to make sure
that all– Related to this, we wanted to make sure
all the communication was very asynchronous, and not blocking in weird ways that would cause
the user interface to–to be janky, and so that’s “stable,” and “fast,” well, we recognized that
by dividing the browser up into multiple processes, we could see benefits
on multi-core systems. That’s one thing. We saw that, you know, if you have many
different applications going, well, they can all
be running independently. Great. But what I wanna talk about
here also is that we found that even in the case
of running a single web page, we could derive some benefits
from multi-core, so I’ll talk about that
in a bit. “Secure.” Well, in the very beginning, we had no idea we’d be so lucky as to hire a team
of sandboxing experts, but it seemed reasonable that if you
had a separate process for your rendering agent that you could probably
do something to limit the harm if it were ever corrupted. Turns out that, you know, we’re able to bring on the team
of experts in this area, and that led to us
pushing really hard to separate the renderer
as much as possible from the operating system, so that it was easy
to sandbox it, so I’ll talk about
some of that. So here’s some of the goals. First of all, though, I wanna draw another picture just to give you a diagram
of sort of some of the terms. So what we were trying to do
here really is bring WebKit into the browser in this multi-process way, so WebKit represents
a rendering area using something
called a WebView, and so we basically
have this thing that renders a web page that we need to have running in the subprocess, and then get its pixels and put it on the screen
in the main process, so we wrap the WebView with something called
a RenderView, and then this guy talks, as I’ll talk about later,
over IPC, to something called
a RenderView host, so this is in the browser. And then here’s in
the renderer process. The renderer. [man speaks indistinct] Okay, yeah. Peter made the comment that when we say “view” here, we’re not actually
talking at all about Chrome views, which was actually
developed much later. And to add confusion, there’s actually a class
in WebKit called RenderView, but we’re not–
We don’t mean that. We mean the class in Chrome called RenderView. [laughter] So this is the WebKit side… over here, and essentially
what you’ll see in the code is that there’s a WebView contained by RenderView, which then does IPC plumbing to something in the browser called a RenderView host. So another picture. Okay. Stitching it all together, if you actually get down
and look at the code, what we have in the browser
as I mentioned here, we have these things
called RenderView hosts, each of which actually corresponds to a native widget, like an hWnd in Windows, and we can have
multiple of these, one for every tab, okay, and then
on the other renderer side, we have a corresponding
RenderView, again, one for each tab, and these guys, if they’re all corr– As I’ve drawn it here, these are all
in one renderer process, so these are the corresponding
objects in the browser for those guys. And they talk to something called a render process host, who then over IPC is communicating to a corresponding
renderer process. So this is actually– This diagram shows the IPC flow. If a RenderView host wants to talk to his peer, he has to send a message through his render process host over an IPC channel to that guy, and so messages between RenderView host
and RenderView, we call them routed messages. Just wanna get
all the terminology down. Routed messages. And then messages that just
terminate at the process level, at the render process host or at the render process, we call those control messages. Okay, so as I talk about
some of the other stuff, this hopefully will help. One more thing you’ll see if you’re in the code,
actually, is that RenderView
and RenderView host, they have a base class
called RenderWidget, RenderWidget host. I bring that up because there’s also over here something called a web widget. And conceptually, this divide is simply between things that are purely
just graphics related, and things that add to that web page related stuff, so the concept of loading URLs is at the RenderView level,
WebView level. The concept
of just displaying pixels and dealing with input and scrolling
and cursors and so on is at the widget level. And we also reuse RenderWidget, WebWidget, and so on for other types of UI elements that are driven by WebKit that actually don’t
have anything to do with loading pages, which are things like
drop-down selects, so if you have a select menu
in the web page and you click on it, it brings down
a little pop-up window. That pop-up window’s contents are rendered by WebKit, but it’s important to note that that pop-up content area isn’t contained
by the owning RenderView. It can actually extend
beyond the bounds of the frame, so it really needs to be its own top-level hWnd, so we reuse the RenderWidget for that kind of stuff. Okay. So then a little bit about IPC. When we first
started this project, a lot of people were like, “We’ll just use COM. “That’ll solve
all your IPC problems, and it kinda gets you
everything for free.” WebKit already at that time had a COM interface on Windows, so we tried that a little bit. COM had this sort of fundamental
limitation, though, that the IPC was synchronous, or if you tried to do it
asynchronous, it’s not fully asynchronous, and so the performance
wasn’t there. The kinds
of stability properties we were looking for
weren’t there that I talked about earlier. Essentially, if we were
blocked on a renderer from the browser side, if we were blocked
on a renderer, we were hanging the browser, and then we were having bad UI, so we really wanted to make sure we had a very
asynchronous model for IPC. So then we went and just said, “Well, let’s just use a pipe,” and we ended up
using a named pipe, then we used asynchronous I/O over a named pipe. That’s great. Now we can send
asynchronous messages, and what we ended up doing,
though, we ended up needing,
in some cases, synchronous messages, and so we–for example, we found that there are
plenty of times when WebKit
needs some information, and he needs it now, and so that’s a synchronous IPC, where he blocks
till he gets a result. But we decided
that we will never do blocking IPCs
from the browser side, because of this whole problem of maybe the renderer
won’t respond, and we wanna make sure the browser’s always responsive. Okay. So this sort of segues to talking about threads
a little bit, so I thought
as part of this talk I should give
a little bit of description of the threads in Chrome, ’cause it’s not always obvious. You hear people talk about the I/O thread, the DB thread, the file thread, the, you know,
“What the F” thread, and maybe it would help to know why we have
the threads we have. So in the browser side
over here, we have something
called a UI thread, which is where almost all
the native widgetry runs, okay? And we have an I/O thread, which is where we handle
the IPC traffic, network… network loading, and various other kinds of what I’ll call… routing type events. So this thread tries really hard to never be stopped on anything. It tries to be
purely asynchronous, so then we necessarily need
a different thread for cases when we need
blocking I/O, and that ended up just
as the evolution of things being called the file thread, and this is where we do
blocking I/O as well as use COM when we need to talk to the OS, or Shell32 calls, because all these things
just are really awful. They can hang your browser
for a long time, so we don’t want to use them on the UI thread
if we can help it. There’s some small exceptions, because we used
a CRichEditCtrl here, which turns out to use COM, and is actually
a huge contributor to why Chrome
might not be as fast as it could be to start up. Ask Peter about that one. Then there’s the DB thread where we used SQLite, and there’s a few other
random threads, but these are the main ones you’ll hear people talk about. Now on this side, on the renderer side, we also have– we have an I/O thread as well, which again is used
to receive IPCs, process IPCs, and then we have something
called the render thread… [person sneezes] …and this is the– this turns out to be
the WebKit main thread, so WebKit is single-threaded, except more recently, where it handles worker threads, but traditionally
it’s always been a single-threaded
kind of library, and this is the main thread where all
the WebKit action happens. And so to implement things
like synchronous IPCs, the renderer thread
gets blocked on a mutex, but the I/O thread here services the IPCs to this thread here, where then some answer
is provided and back to this guy. There’s one thing that I just
really wanna bring up while we’re talking
about threads, ’cause I think it’s something that is really easy
to get bitten by, which is that when dealing with
synchronous IPCs, like, say you were implementing
some new web API, or you, like–local storage
or session storage, or things of this nature where you need it now, ’cause the JavaScript
application wants a result. It’s tempting to– You start out with this
synchronous IPC, and the traditional way that one might handle an IPC
coming from a RenderView, is to handle it
in the RenderView host, but the RenderView host
is an hWnd, has an hWnd, and it’s actually
living on the UI thread, and so if a synchronous IPC comes from here
all the way to the UI thread, this actually turns out to be
a huge problem for us, and it’s not obvious from what I’ve written
up here why, but plug-ins
are the source of this problem, and you can end up
with deadlocks, and I’ll spend a little bit
of time explaining this, ’cause I think it’s really
helpful for people to know. So in Windows, when you have hWnds– I’m gonna draw pictures
of hWnds here, where the green one
is a plug-in. Like Flash. Flash allocates its own hWnd, and Chrome has its outer hWnd. Well,
when you do this kind of thing, because we’re running plug-ins
out of process, there’s actual
synchronous communication that Windows does
between parent and child hWnds. So it’s possible
for our browser UI thread thanks to Windows, to be blocked, waiting
for Flash to do something, like paint itself
or service an input event. And it turns out
that Flash itself can also do things
like script a page. He can execute script
in the outer page. When he executes script
in the outer page, because we’re running Flash
out of process, what he’s really doing is
sending a synchronous IPC to the renderer, to the render thread, saying, “Please execute
this script for me.” Well, while that’s happening, he’s not gonna be responsive to these incoming
Windows events. And so if the renderer thread were actually at that time trying to communicate
over to the browser UI thread, we could deadlock, because the browser UI thread might be blocked on the plug-in, and you get
these nasty deadlocks, so we just have this
very simple rule in Chrome. Any synchronous IPC
coming from the renderer should terminate
at the I/O thread or one of these other
background threads, but it should never be– never terminate here, never at the UI thread, and then we’re safe. We can allow all kinds
of crazy synchronous IPCs that are doing weird things, and everything’s happy. So then, okay,
now I really wanna move on to some more interesting things. Like painting. Everybody got this? So painting… scrolling… resizing, restoring tabs. All of these things, what they have in common is– these all have a lot in common
which I wanna talk about. So when we– I mentioned very early on, in order to achieve
these nice stability properties, where it looks like the browser can always be responsive
to the user, we were showing a possibly
old representation of the pixels for the page. So every RenderView host has a backing store, which is the bitmap of the last rendered version
of the page, and the pixels for that
come from the RenderView, which has functions like paint to produce pixels,
and WebKit through this area will call back and do things
like invalidate… a Rect. So we will observe– the RenderView will be observing invalidates
from WebKit, and when he wants to, he can ask WebKit to paint
and produce pixels, and then what he does
is ship those pixels over to the RenderView host, and the RenderView host puts those pixels
into his backing store, and now we have a representation
stored of the page that we can always
put on the screen, and he gets a WM_PAINT. This is the Windows message
saying, “Hey, look, you gotta put
something on the screen, ’cause I don’t have anything.” And so what we’ll do then is just on the backing store
put it on the screen. So painting in the browser now doesn’t involve
the renderer at all. We can always just paint
from the backing store, and things are fast. [indistinct] from
the backing store is fast. Asynchronous to that, the renderer can update it. Early on, we built this system
and it was working, but I noticed some funny things. Like, if you take Google Maps and you grab the tile– grab the Google Map and you actually try to move it, sometimes you would never see the pixels
on the screen changing, but the CPU usage
would spike way up, and you’re wondering
what’s going on. Turns out that over here we were doing fun invalidate
paint kinds of operations, and getting lots of invalidates, doing lots of painting, shipping bitmaps over, and meanwhile
on the RenderView host side, we were receiving mouse inputs, sending mouse events over, and all this was going on, but we were never
getting a WM_PAINT. Turns out that input events on Windows always trump, always take precedence
over painting, and so that was fun. So we had to then add something to make sure that
we never produced a new bitmap until the last one had made it on screen, so that was some interesting
counting that we just had to do, and then as a result, we don’t have this problem, so we built
an acknowledgement-based system, so there’s an IPC that says, you know, something, like, I think it’s called Paint, PaintRect or something, and he carries with it some shared memory bitmap containing the data for the invalid region, the newly updated region, and then the RenderView host has to send back
an acknowledgement. And then based on that everything works much better. Because most everybody
was on dual-core at the time, no one ever saw that problem
with Google Maps. It was only when you got on
a single-core machine. But an acknowledgement-based
system works great there. Scrolling. Scrolling is very similar. The thing you wanna do
with scrolling to get very good performance is you wanna basically– Suppose you have– These are all the pixels
on the screen. What you really wanna do is– Suppose you’re scrolling down. You wanna take the region that’s still to be
on the screen, and just shift
all those pixels down to here, and then backfill the exposed region. So scrolling– [man speaks indistinctly] Yeah, he worked
on the scroll work. He’s right. But what’s going on here is that an input event
makes its way to WebKit, WebKit’s like, “Hey, that hit a scrollbar.” Scrollbar did its thing, said, “Move,” it scrolled the page, we get a command here that looks very different
than invalidate. It’s called something like
ScrollRect. It gives us the rect
that should be scrolled, and then dx, dy. Then what we do is, we set up
the same kind of thing. We wanna send an IPC
that says “ScrollRect,” passing the same parameters. We also call paint in order
to generate this region, sending that bitmap
along with the dx, dy and so on. RenderView host then says, “Well, I need to perform
the same operations on my backing store.” He takes his backing store, he shifts the pixels down, fills in the exposed region, and now he can, you know, tell the hWnd, “Hey, look, we need to do a scroll
operation,” and then when that
scroll operation happens, he’s reading all the pixels out of his backing store again, so they’re very decoupled. The renderers decouple, painting, actually, the screen, but because
we were actually getting DX UI and the bitmap, we can do things
like use Windows APIs like ScrollDC and ScrollWindow to actually do
optimized scrolling on the hWnd and to give
the graphics system knowledge that we just want to do these shift of pixels
and backfill, which allows scrolling
to be low CPU usage. It turns out
that some pages have, like, fixed-position content,
which defeats optimized scrolling, because you can’t just
shift pixels down, you have to re-render stuff because of overlapping things. This is a problem in Gmail, like, with the little
chat windows that come up. That’s why
Gmail scrolling’s slow, because that chat window might– it’s always interfering with the content below it. Anyhow. Resizing. Resizing is kind of interesting, because
what’s happening there is– I’ll just erase part of this. With resizing, what’s happening is that
we have the outer hWnd, user grabs the corner, Windows tells us, “Hey, look, your size changed,” and then right after that
he says, “Now paint.” And this is a case where the backing store’s
not good enough. Our backing store is too small, and Windows just– and we’re–
Now I need new pixels, so what do we do? This is where we ended up sending an IPC
down to the renderer to have him produce
the new pixels for the whole thing, and then we do wait. Then we wait a little bit. Like, I forget how much. Maybe up to 40 milliseconds, or something reasonably small. We wait for our I/O thread to receive an IPC
that actually carries with it the shared memory
for the new rendering, and so if we get it
in enough time, then what we’re actually
able to do, imperceptibly to the user, is display
the new pixels, and they never see
the old representation at the wrong size. However, I’m sure
anybody who’s used Chrome knows that often times you’ll get
this little white border here where we didn’t have pixels yet, and we’re just
still catching up. And mostly that’s limited
by how fast WebKit can re-lay out the whole page
as a result of a resize. So, Gmail,
this will always happen. Google.com,
it’ll never happen. Anyways, Google.com
has a white background. So turns out that because
we’re using these backing stores to get good performance
of the browser, we’re also
using a lot of memory, and so if we have
a lot of tabs, we’ll be using a lot of memory ’cause we have
a ton of backing stores, and that’s kinda bad, so we need a cache, obviously. We have only
so many backing stores. So then what about tabs that
you haven’t been to in a while, and you go back to them, what are you gonna do? Well, it’s kinda similar
to the resize problem. You don’t have a backing store,
but you need it now, so we used the same trick– asked the renderer
to produce pixels, wait just a little bit, see if he can produce them
in enough time. If he does,
then it’s all seamless, and we actually
ended up extending this. We said that
even foreground tabs, ’cause you have
a lot of windows, even foreground tabs can potentially lose
their backing store, and if the user interacts
with that page now, we have to go get
the backing store and fill it again, and the same trick
is used there. It all kinda works. I said I would save time
for questions, so I’m gonna just end there, and ask if people
have questions. Is there anything
you want me to talk about, or should I go on? Five minutes?
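
The backing-store cache just described can be sketched as a small least-recently-used map. This is a minimal sketch under assumptions, not Chromium's actual code: `BackingStore`, `BackingStoreCache`, the integer view id, and the capacity are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the bitmap of the last rendered page.
struct BackingStore {
  int width = 0;
  int height = 0;
  std::vector<std::uint32_t> pixels;
};

// Fixed-capacity LRU cache of backing stores, keyed by a hypothetical view id.
class BackingStoreCache {
 public:
  explicit BackingStoreCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity_ > 0);
  }

  // Stores (or replaces) the bitmap for `view_id`. When the cache is full,
  // the least recently used entry is evicted first.
  void Put(int view_id, BackingStore store) {
    auto it = index_.find(view_id);
    if (it != index_.end()) {
      entries_.erase(it->second);
      index_.erase(it);
    } else if (entries_.size() == capacity_) {
      index_.erase(entries_.back().first);
      entries_.pop_back();
    }
    entries_.emplace_front(view_id, std::move(store));
    index_[view_id] = entries_.begin();
  }

  // Returns the cached bitmap and marks it most recently used, or nullptr
  // when the view has lost its backing store.
  const BackingStore* Get(int view_id) {
    auto it = index_.find(view_id);
    if (it == index_.end()) return nullptr;
    entries_.splice(entries_.begin(), entries_, it->second);
    return &it->second->second;
  }

 private:
  using Entry = std::pair<int, BackingStore>;
  std::size_t capacity_;
  std::list<Entry> entries_;  // front = most recently used
  std::unordered_map<int, std::list<Entry>::iterator> index_;
};
```

When `Get` returns `nullptr`, the browser would fall back to the trick from the talk: ask the renderer for fresh pixels and wait a short, bounded time before painting whatever it has.
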
I’ll just go on. Okay, sure. So I mentioned a little bit in the beginning about events. Turns out that we– Or actually,
I won’t talk about that. I’ll talk about sandboxing
a little bit more, and some of the security steps, some of the stuff we did
to facilitate security. So as I mentioned with plug-ins, hWnds can be multi-process. You can have a different hWnd
for a child process, so why not just use hWnds
for the renderer? Why do all this crazy stuff I just got done describing? Well, turns out that we just
lose the ability to sandbox it as effectively, because there’s
these connections between child and parent hWnd, it’d be possible
for the renderer, a corrupt renderer, to mess with
the window hierarchy, and thereby screw up the browser in interesting ways. Also, hWnds like this would reveal information
about the desktop, and one of the things
that the sandbox achieved was literally
running the renderer with a different desktop, so it’s running on its own
virtual desktop, so it’s such
that if there were ever any kind of hole in the sandbox that allowed people to play
with user32 libraries, all they’d be getting is user resources
for a different desktop. And so this adds
another level of security, just to kind of keep user input and other kinds of things away from this renderer, as far away as possible. Right. So this really necessitated getting all the Windowsisms
out of the renderer, and moving to this sort of more in-memory kind of approach.>>Peter:
Isn’t there one other reason to not use hWnds for renderers, which is to not interlock
the message queues, and thus possibly
hang our browser if the renderer hangs up?>>Fisher:
Yes, what Peter said. With all the synchronous
IPCs that happen between hWnds in Windows, we have a potential problem that the child window
is blocked, waiting on its parent
for something, and I explained earlier that whenever
we have a situation where the renderer thread is blocked, potentially,
on the UI thread of Chrome, bad things can happen. So this actually,
for a number of reasons, turned out to be very necessary to just get the hWnds
out of the rendering process.>>Peter: Which means
we can still lock up the browser due to hung plug-ins, necessitating a hung
plug-in detector, right?>>Fisher:
Yes, so if anybody’s seen– If you wait long enough
after Flash locks up, if you wait 30 seconds,
I think, you’ll see
a little dialogue come up offering to kill Flash, because we’ve detected
that it’s not responsive, and that it’s
brought down or wedged the whole widget hierarchy
it’s associated with. We use the same kinda API that Windows does to figure out that it should put up
the little “end task” dialogue or the unresponsive treatment in the title bar
of your application. Any other questions? Okay, then I’ll just throw out
one more interesting thing. Turns out because
we’re multi-process– Ojan reminded me of this. Turns out because
we’re multi-process, we can do things like potentially just kill, terminate the child process when we close a tab. If the child process
has no unload handlers, if the web pages in there have no unload handlers and no [indistinct]
load handlers, then the web page anyways
has no idea if it’s gone, so we look to see did the web page have any of those kinds
of event handlers, and if it doesn’t, then when we wanna kill it, we just call TerminateProcess, and it turns out to be really nice, ’cause then tabs close quickly. Okay. End. [applause]
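
The fast tab-close decision described at the end of the talk can be sketched as a simple check over the pages hosted in one renderer process. This is an illustration under assumptions, not Chromium's implementation: `PageInfo`, its fields, `CloseAction`, and `ChooseCloseAction` are hypothetical names. The second handler type is inaudible in the recording, so it is modeled here as a generic flag.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-page summary that the renderer could report to the browser.
struct PageInfo {
  bool has_unload_handler = false;
  // The talk names a second handler type that is inaudible in the recording;
  // this flag is a placeholder for it.
  bool has_other_close_handler = false;
};

enum class CloseAction {
  kTerminateProcess,  // no page can observe the close, so just kill the process
  kRunHandlersFirst,  // some page must be given the chance to run its handlers
};

// Decides how to shut down a renderer process when its tab is closed.
// Every page in the process has to be handler-free for the fast path.
CloseAction ChooseCloseAction(const std::vector<PageInfo>& pages_in_process) {
  for (const PageInfo& page : pages_in_process) {
    if (page.has_unload_handler || page.has_other_close_handler) {
      return CloseAction::kRunHandlersFirst;
    }
  }
  return CloseAction::kTerminateProcess;
}
```

The check is deliberately conservative: a single page with a handler forces the slow path for the whole process, because terminating the process would take that page down without running its handler.
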

