
Present Perfect

gooey stuff

Filed under: Fluendo,Hacking — Thomas @ 21:00

2004-09-15

Finally managed to transfer, on the fly, UI components that contain more than one file (in our case, a file of code and a glade file). The component on some other machine sends its UI files as a bundle, and that bundle gets cached locally based on its md5sum. So I change something in the UI code on machine A, reload the view in the admin client on machine B, and machine B asks for the bundle, realises it has changed, asks for the new files, caches them locally, and then executes the new code.
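For what it's worth, the caching side only takes a handful of lines. This is a rough sketch under made-up names, not the real code: fetch_bundle_sum() and fetch_bundle() are hypothetical stand-ins for whatever remote calls the admin client actually makes, and the cache location is invented.

import os

# Hypothetical sketch of md5sum-based bundle caching, not the actual code:
# fetch_bundle_sum() and fetch_bundle() stand in for the remote calls the
# admin client makes, and the cache directory is made up.
CACHE_DIR = os.path.expanduser("~/.cache/ui-bundles")

def get_bundle(name, fetch_bundle_sum, fetch_bundle):
    """Return a local directory holding the current version of a UI bundle.

    fetch_bundle_sum(name) -> md5 hex digest of the bundle on the remote side
    fetch_bundle(name) -> dict mapping relative filename to file contents
    """
    remote_sum = fetch_bundle_sum(name)
    target = os.path.join(CACHE_DIR, remote_sum)
    if not os.path.isdir(target):
        # Cache miss: the bundle is new or has changed, so fetch all its files.
        files = fetch_bundle(name)
        for relpath, data in files.items():
            path = os.path.join(target, relpath)
            subdir = os.path.dirname(path)
            if not os.path.isdir(subdir):
                os.makedirs(subdir)
            f = open(path, "wb")
            f.write(data)
            f.close()
    return target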

Net result is that as soon as I click save in the editor, and click in the admin client, I'm running new UI code. Pretty sweet. I'd hate to have to think this up in C.

I finished this piece based on Johan's first draft yesterday during the Mono talk, because I was juggling lots of objects in my head and wanted to flush before I forgot.

Here's the UI for the statistics of the HTTP streamer. I'm happy that I have the HIG to give me direction - I may not agree everywhere, but it sure beats the crap out of having to figure this out all by myself. HIG police, please tell me if I missed a rule or did something wrong in this shot.

Now I need to clean up the code, objectify correctly, and then move on to the next component to add a UI for. And as proof of concept I should test our objectifying by writing an HTTP admin interface as well. Should be easy.

Now I'm off to Boston, so I'll try not to work on the server for a few days. But it'll be hard because it's so much fun...

today is a good day

Filed under: Fluendo — Thomas @ 05:19

2004-09-11

Exhausted from lack of sleep all week long. For some reason I wake up at 9 every day.
Today we discussed a lot about the user interface of the server we want to use. I think we ended up agreeing on a nice technical approach that will be very good from a UI point of view. It'll be so much fun to implement a technically very complex system of combining UI from all sorts of places into a nice consistent interface. And all code will be sent on the fly over the wire, and cached. Stuff like that is just so easy to do in Python.
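To illustrate why this sort of thing is pleasant in Python (a generic sketch, not our actual mechanism): turning source text that arrives over the wire into a live module is only a few lines.

import types

def load_module_from_source(name, source):
    # Generic illustration, not our actual mechanism: compile and run source
    # text received from another machine, and hand back a live module object.
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module

# Pretend this string just arrived over the wire from a remote component.
source = "def label():\n    return 'HTTP streamer statistics'\n"
ui = load_module_from_source("remote_ui", source)
print(ui.label())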

Afterwards, Ronald arrived. He seems happy to be working on GStreamer for us, which is great! I hope we can still get Totem in FC3.

Then, office party. We were still checking Christian's Slashdot submission about our article and applet. It had been pending for over a day. Still no dice.

After that, out for a very nice dinner at the Bestial with the team. I wanted to get home quickly to finally have a nice early night.

So I browse some stuff, check if the stream is still running, do some more things, and do one final check to make sure the firewire isn't complaining. Network down. Damn crap router acting up again. Still, decide to wait, because I'm anal-retentive and I always think "you never know if it hit some site and we have traffic". Reload the admin page. 200 people. Blink. Check Slashdot. OMG. It's on. Click reload on the admin page. 400 people.

Start calling colleagues. Do a short dance. 600 people. Server hanging in nicely. Unplug the machine on my home network that resets the router every 20 minutes, so I have a good connection for the rest of the night while I do stuff.

Read someone saying on Slashdot that "there is no sound". Damn, remember that I had turned off the sound for the discussions this afternoon. Fire up the user interface for the administration, go to the producing component, and set the volume level to 50%. Sound is fine now.

Call a friend in Belgium. "Can I have your bandwidth and some access, so I can transfer our shiny new server and set up a relay?" Lots of fiddling to try and set things up on FC1 (we need Python 2.3). Install mach, set up an FC2 root. Install stuff. Test the server. Start the server with a relay config. The new streamer pops up in my UI immediately on my local machine, and I can see how many people are connected. Awesome.

Here's a screenshot I took of the UI I'm using to control the server (don't worry, it's just a quick prototype hack; as I said, we're still designing the UI ...). You can see that it's handling over 2000 clients, CPU usage on the 800 MHz streaming server is fine, and another relayer (which has a red cross due to some bug I need to fix) is also streaming to a whole bunch of clients. The main server is maxed out on bandwidth at about 55 Mbit/sec. The second server is getting up to about 700 clients as we speak. I forgot to increase the file descriptor limit on that box, so it wouldn't accept more than 1000 clients, and I had to restart it after tweaking the config.
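As an aside, the per-process descriptor limit can be bumped from within Python itself. A minimal sketch of that (just an illustration of the limit involved, not necessarily how our server handles it):

import resource

# Raise the per-process file descriptor limit so one process can hold more
# than ~1000 client connections. Only the soft limit can be raised here;
# pushing the hard limit above its current value needs root (or limits.conf).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("fd limit: soft=%d, hard=%d" % (soft, hard))
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))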

But, surprisingly (even if I shouldn't say so myself), the server is holding up great against a Slashdotting.
Wim sums it up nicely:

<wtay> so, the slashdot effects is about 2400 clients.. how lame..

Of course, it's Friday late night, people have gone home, so I expect a steady stream of interest over the weekend. Let's hope the firewire bus doesn't stop working randomly like it sometimes tends to do.

And people seem to like the barebones Java applet a lot as well.

Anyway, four hours later, server still running, I am still desperately lacking sleep, and it's 5 AM. I know I'll regret this tomorrow, but it's been fun. Enough tooting my own horn, back to the weekend.

Sometimes

Filed under: Fluendo — Thomas @ 11:43

2004-09-01

... Wim scares me. I think he took about four days to port Theora to Java. And it actually runs.

I'm still not going to learn Java though.

Server

Filed under: Fluendo — Thomas @ 16:27

2004-08-18

Our server is currently serving 1825 real people. Sweet. We fixed the big bug from last week and broke through the default file descriptor limit with the help of PollMan (TM). So we have one machine producing in various qualities from various input cards, overlaying some of them, and compressing using Wim's smoke codec. And then we have another machine, far away from the first, relaying the produced streams to a bunch of clients. Time to start on MPEG4.
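For the curious, the point of a poll()-based manager is that select() is capped at FD_SETSIZE (typically 1024 descriptors), while poll() takes as many fds as you register. A generic sketch of the idea, not PollMan itself:

import select

# Generic illustration of a poll()-based fd manager, not PollMan itself:
# select() is capped by FD_SETSIZE, poll() takes any number of registered fds.
poller = select.poll()

def watch(sock):
    # Watch a client socket for writability so we know when to push more data;
    # its fd number may well be above 1024.
    poller.register(sock.fileno(), select.POLLOUT)

def handle_events(timeout_ms=100):
    for fd, event in poller.poll(timeout_ms):
        if event & (select.POLLHUP | select.POLLERR):
            poller.unregister(fd)   # the client went away, stop watching it
        elif event & select.POLLOUT:
            pass                    # socket can take more data: send the next chunk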

Christian just arrived with his backpacks; he starts today.

Johan, Wim and Christian are going to Akademy to stream the KDE conference. I hope it works out well for them, but they look prepared. Well, except for Christian :)

I'm not going, because I'm off to a festival, yeeha !

Server

Filed under: Fluendo,Hacking — Thomas @ 22:47

2004-08-09

So, our server has been tested in production a bunch of times. Each time, it runs fine for fifteen minutes, with clients connecting to it all the time. It serves about 500 streams without any issues, at only about 1% total CPU usage. At some random point, however, it drops clients and hangs; it looks like it's hanging in a read, but the stack trace seems corrupted.

The hard thing about this kind of problem is that we cannot trigger it in a local setup where we simulate clients in a dumb way (1000 wget processes, for example), and that on the server it's hard to get usable debug info. The log file with only DEBUG logging from one element on the server is about 600 MB by the time this problem happens. And really, 1000 wgets are no good simulation for 1000 real users, each with their own network speed and their own "Reload" push frequency.

I've searched the net for network stress test tools, but haven't found anything yet that I can use. All of the web stress test tools use the complete request as the testing primitive. Meaning, a successful request is one where you get a complete reply, a full page, and a normal end. Of course, our streams are "infinite", so we cannot use these test apps.

Other network testing tools work at a lower level, which would mean we'd have to write some TCP/HTTP handling code as well. Really, what we need is a tool that lets us get a URL, specify the client's bandwidth and possibly a bandwidth profile, and keep connections alive for a random amount of time. If you know of anything, let me know.
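To make concrete what I mean, here's a rough sketch of such a tool in Python, with made-up defaults (the port, bandwidth figures and client counts are placeholders, not our setup): each simulated client fetches a stream URL, reads at its own pace, and hangs up after a random amount of time.

import random
import socket
import threading
import time

def stream_client(host, port, path, bandwidth, lifetime):
    # One simulated viewer: fetch the stream, read at roughly 'bandwidth'
    # bytes per second, and disconnect after 'lifetime' seconds.
    s = socket.create_connection((host, port))
    s.sendall(("GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)).encode())
    deadline = time.time() + lifetime
    chunk = 4096
    while time.time() < deadline:
        if not s.recv(chunk):
            break                                  # server closed the stream
        time.sleep(chunk / float(bandwidth))       # crude bandwidth throttle
    s.close()

def spawn_clients(count, host="localhost", port=8800, path="/"):
    # Placeholder defaults; mix of connection speeds and connection lifetimes.
    for _ in range(count):
        bandwidth = random.choice([16, 64, 256]) * 1024    # bytes per second
        lifetime = random.uniform(10, 300)                 # seconds connected
        t = threading.Thread(target=stream_client,
                             args=(host, port, path, bandwidth, lifetime))
        t.daemon = True
        t.start()

# spawn_clients(500) would approximate 500 viewers with mixed speeds.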

Anyway, I started reading about possible limits for file descriptors and so on, and learned a bunch of useful new stuff. Then I started theorizing about possible failure scenarios from what I had learnt, and went through our plugin code again to see if these cases could be triggered. I also thought about how I could test each of these cases.

The actual bug seems to be a really silly oversight in handling some error cases, but the good thing is that I now have about ten different points to watch out for, along with ways to reproduce, test and fix each of them. I can hardly wait to get to work tomorrow to start doing all these tests, because something tells me this will fix our problem and give us a rock-solid server. Or, at least, one that runs for more than fifteen minutes when faced with a lot of clients :)
