use and abuse of pipes with audio data

Posted by Anonymous on Tue 24 May 2005 at 12:17

What ho! Get on board! Let me take you on a voyage of discovery on the vast ocean of GNU/Linux. We will venture forth and study some nifty tools, including the curious thing known as a fifo.

fifo file fun!

What is a fifo? Well, it's a special sort of file. The name stands for "first in, first out", indicating that data comes out of it in the same order it went in. It is sometimes also called a named pipe.

Why is that?

Well, suppose I type

mpg123 -s *.mp3 | rawplay
This is a nice simple one-liner. It gets the mp3 files converted into a stream of raw audio and plays the raw audio with rawplay ("apt-get install rawrec" will install rawplay).

Looking at it a bit closer, I could say I am piping the output of mpg123 to a temporary file which I do not see (symbolised by the | sign) and then running that temporary file past rawplay. I could say the temporary file is hidden from sight. I could say all that, but I'd be wrong actually, because it is not a temporary file - it is a buffer - and data continuously streams via this buffer.

What's the difference between a file and a buffer?

Well, let me uglify the one-liner by bringing a temporary file into sight. Let me name the temporary file explicitly. I could do:

mpg123 -s *.mp3 > tempfile
cat tempfile | rawplay
rm tempfile

and get the same result as our one-liner. But this is now a multistep process, and would probably involve an awful lot of disk activity, as tempfile is written out in its full glory and only then streamed into rawplay. Not at all nice and continuous compared to the one-liner buffer version.
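
To make the difference a bit more concrete, here is a minimal sketch that needs neither mpg123 nor rawplay. The produce function and the use of wc as the "player" are stand-ins I made up for illustration; the point is only that both routes deliver the same bytes, but the tempfile route parks all of them on disk first.

```shell
# Hypothetical stand-in for "mpg123 -s *.mp3": emits a few lines of fake audio.
produce() { printf 'chunk %s\n' 1 2 3; }

workdir=$(mktemp -d)

# Pipe version: data streams straight from producer to consumer.
via_pipe=$(produce | wc -c | tr -d ' ')

# Temporary-file version: everything lands on disk first, then is read back.
produce > "$workdir/tempfile"
via_file=$(wc -c < "$workdir/tempfile" | tr -d ' ')
on_disk=$(ls -ln "$workdir/tempfile" | awk '{print $5}')

rm -r "$workdir"
echo "pipe: $via_pipe bytes, file: $via_file bytes, size on disk: $on_disk bytes"
```

With real audio the size on disk would be the whole decoded album, which is exactly the overhead the pipe avoids.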

There is another way of making it a multistep process though. I can make an equivalent of the one-liner buffer version, without the disk overhead, but still using a temporary handle, like this:

mkfifo myfifo
# the & matters: the writer waits until something opens myfifo for reading
mpg123 -s *.mp3 > myfifo &
cat myfifo | rawplay
rm myfifo

Oh yes. I sneaked the fifo thing in here. It is called a "named pipe" because it is a pipe type of operation being done and the pipe has been explicitly named. In this case it has been called myfifo. myfifo does the first in/first out bit, and it differs from a normal file (such as tempfile earlier) because it has a file size of 0. Not surprisingly, perhaps, because it is a buffer, passing data along as it gets it.
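
Here is a small sketch of those two properties, again with a printf stand-in instead of real audio (the file names and the stand-in are my own assumptions, not part of the original recipe). It checks that myfifo really is a fifo, that it reports a size of 0, and that whatever goes in comes out the other end.

```shell
workdir=$(mktemp -d)
fifo="$workdir/myfifo"
mkfifo "$fifo"

# The writer blocks until a reader turns up, so it runs in the background.
printf 'raw audio stand-in\n' > "$fifo" &

# test -p is true only for fifos; ls -ln reports the size in column 5.
if [ -p "$fifo" ]; then is_fifo=yes; else is_fifo=no; fi
fifo_size=$(ls -ln "$fifo" | awk '{print $5}')

# Opening the read end lets the data through.
received=$(cat "$fifo")
wait
rm -r "$workdir"
echo "is fifo: $is_fifo, size: $fifo_size, received: $received"
```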

That's nifty. But what's it good for? Why should I even bother with a multistep fifo process using this weirdo stuff when I have the one-liner?

Well, now I have an explicit handle on the step in the middle, a handle that takes up no diskspace, and it lets me do things on-the-fly to the stream.

Fifo? A handle that lets me do things on the fly?

Yes. Some more examples are probably a good idea now:

I will use the netcat utility to illustrate things. netcat (or nc) is a must-have tool that allows the machine to send and listen to stuff on ports. (We covered an introduction to netcat previously - but if you don't have it "apt-get install netcat" will install it).

Netcat is like a telnet on steroids. Eg: suppose I have an SMTP server, which I will unimaginatively call smtpserver, allowing access on the standard port 25. Then:

nc smtpserver 25

will be pretty much the same as

telnet smtpserver 25

See the man page for netcat for more details.

Now, suppose I have this tinpot laptop with horrible shrill speakers, and a multimedia desktop machine with StupendaSound speakers, (you know the kind I mean, the ones with an eyepopping, earbleeding bass). They are connected via LAN. I don't want to hear the stream on my tinny sounding laptop - I want to hear it in its full ghettoblasting glory on the multimedia desktop.

So on tinpot I type:

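tinpot:~$ mkfifo myfifo
tinpot:~$ mpg123 -s *.mp3 > myfifo &

Ie: I make the fifo, and start streaming the raw audio to it in the background. (mpg123 just sits there waiting until something opens myfifo for reading.)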

On ghettoblaster I type:

ghettoblaster:~$ nc -l -p 2345 | rawplay

Here, netcat is listening on port 2345, and piping anything it hears to rawplay. There is no sound being heard yet.

Then back on tinpot I type:

tinpot:~$ cat myfifo | nc ghettoblaster 2345

Ie, I am taking the myfifo buffer on tinpot and forwarding it to netcat, which sends it out to ghettoblaster's port 2345.

The moment I type in the last line, the sound starts up (think of it as a valve in a plumbing pipe system - until the myfifo valve is released, nothing flows). Of course, this would be a terrible way to distribute the stream over the internet, since it is uncompressed. But on a LAN with loads of bandwidth this is just dandy.
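
The valve behaviour can be watched on a single machine. This is only a sketch under my own assumptions: printf stands in for mpg123, cat stands in for the netcat end, and the one second sleep is just there to give the writer time to get stuck.

```shell
workdir=$(mktemp -d)
fifo="$workdir/myfifo"
mkfifo "$fifo"

# Writer: gets stuck opening the fifo because nobody is reading yet.
printf 'la la la\n' > "$fifo" &
writer=$!

sleep 1
# kill -0 sends no signal; it only checks that the process still exists.
if kill -0 "$writer" 2>/dev/null; then before=blocked; else before=finished; fi

# Open the valve: a reader appears and the data flows.
heard=$(cat "$fifo")
wait "$writer"
if kill -0 "$writer" 2>/dev/null; then after=blocked; else after=finished; fi

rm -r "$workdir"
echo "before reader: $before, after reader: $after, heard: $heard"
```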

Why use fifo tricks at all? After all, I could use just these two lines, and do away with myfifo entirely:

ghettoblaster:~$ nc -l -p 2345 | rawplay
tinpot:~$ mpg123 -s *.mp3 | nc ghettoblaster 2345

True, and that works splendidly for two machines if I want only one of them to play the sound. But suppose I have a third machine in the lounge and I want to play the stream there as well, at the same time? Try as I may, I know of no simple way of getting sound out on more than one machine on-the-fly without installing a full distributed sound application - unless I use fifo, the magic handle in the middle. You see, once you have a handle, you can use all the usual trickery with pipes, redirection, and, most importantly in our case, tee.

tee for two (or more)

Let's get really adventurous, and pipe the raw sound data not just to two, or even three machines. No indeed. Let's send it to 6 machines all over the house. It makes the structure of things a bit clearer actually. The set up then goes like this:

On the machine that is the source of the sound, (tinpot) I do:

tinpot:~$ mkfifo myfifo1 myfifo2 myfifo3 .... myfifo6
tinpot:~$ mpg123 -s *.mp3 > myfifo1 &
Then I do the following on the machines indicated.

machine1:~$ nc -l -p 2345 | rawplay
tinpot:~$ cat myfifo1 | tee myfifo2 | nc machine1 2345

machine2:~$ nc -l -p 2345 | rawplay
tinpot:~$ cat myfifo2 | tee myfifo3 | nc machine2 2345

...

machine6:~$ nc -l -p 2345 | rawplay
tinpot:~$ cat myfifo6 | nc machine6 2345

The moment I release the valve (ie type "cat myfifo6 | nc machine6 2345" on tinpot, for the last machine), all 6 machines start playing. (Remember, tee pipes stuff to STDOUT as well as to a file (in this case a fifo). It functions like an audio splitter in this case).
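
The splitter idea can be tried out on one box before wiring up the whole house. In this sketch (my own cut-down version with two listeners) plain files called machine1.out and machine2.out stand in for the "nc machineN 2345" ends, and printf stands in for mpg123; those names are made up for illustration.

```shell
workdir=$(mktemp -d)
mkfifo "$workdir/myfifo1" "$workdir/myfifo2"

# Source: stand-in for "mpg123 -s *.mp3 > myfifo1 &".
printf 'boom\ntish\n' > "$workdir/myfifo1" &

# Last listener in the chain: stand-in for "cat myfifo2 | nc machine2 2345".
cat "$workdir/myfifo2" > "$workdir/machine2.out" &

# First listener: tee copies the stream into the next fifo and to stdout.
cat "$workdir/myfifo1" | tee "$workdir/myfifo2" > "$workdir/machine1.out"
wait

out1=$(cat "$workdir/machine1.out")
out2=$(cat "$workdir/machine2.out")
rm -r "$workdir"
echo "machine1 got: $out1"
echo "machine2 got: $out2"
```

Each extra machine is one more fifo and one more tee in the middle of the chain; only the last link does without tee.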

So there I am, with music blaring more or less in sync throughout the house, and a big geeky grin on my face. For I am now a Master of fifo, tee, and netcat! With my newfound nerd bravado, I venture forth to the next adventure in GNU/Linux geekdom:

natty new nettee net utility - tee for fifo

So what if the gobbledygook title of this subsection doesn't even sound like English? Think of it as me being incoherent with excitement over David Mathog's rather spiffy nettee.

nettee is a clever new utility that sort of combines netcat with tee, and so does away with fifos. Grab it and compile it off the site (there is no debian package for it to date (24 May 2005) that I am aware of, but that may have changed by the time you read this - it seems to only have been around for about a month so far).

Install nettee on all the machines, and run it on each like this:

machine1:~$ ./nettee | rawplay
machine2:~$ ./nettee | rawplay
...
machine6:~$ ./nettee | rawplay

so now each of these is listening on the nettee port (9997 by default).

Then on the source machine you type in:

tinpot:~$ mpg123 -s *.mp3 | ./nettee -in - -next machine1,machine2,machine3, ... ,machine6
That's it!

The walls are a-quaking now as I blast out "WE WILL ROCK YOU!!!" at full annoy-my-neighbours level. Not just on one machine, but simultaneously now on all the machines in the house. The plaster is falling down, and the irritating thumps of my neighbour's broom are drowned out by the boneshaking ear-crushing sound. I thump my desk frenziedly along with the music in the primal excitement of the moment. Feeling deliriously self-satisfied and smug, I bellow out aloud: "Yep, nettee really, literally, ROCKS!!!"

Epilogue

And as this article at last comes back from its voyage of discovery on the endless ocean of GNU/Linux, coming back to rest on terra firma once again, I say unto you who have so bravely gone with me: go forth, young geek, and discover new ways to abuse these tools, and spread the good word! Yea, for verily you are one of the Chosen ones!

PJ

 

 


Posted by Anonymous (85.194.xx.xx) on Tue 24 May 2005 at 16:55
> The name stands for "file in, file out"

Nah, it stands for "first in, first out".

Posted by Anonymous (195.212.xx.xx) on Tue 24 May 2005 at 17:46
nah, everybody knows it stands for "fish in, fish out" ;)

Posted by Anonymous (46.115.xx.xx) on Tue 20 Aug 2013 at 08:16
fool in, fool out!

Posted by bignose (150.101.xx.xx) on Wed 25 May 2005 at 01:46
The "named pipe" is called a FIFO because it implements a FIFO data structure in the filesystem. The F letters have nothing to do with "file". http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?query=fifo

Posted by Anonymous (203.122.xx.xx) on Wed 25 May 2005 at 03:51
[About FIFO actually standing for first in, first out.]

Yes, you're right. Thanks to bignose for the explanatory link. I wondered about the word "file" actually, which is why I worked around it so much in the article.

PJ

Posted by whizse (62.209.xx.xx) on Tue 24 May 2005 at 20:00
Very interesting article! :-)

Posted by Anonymous (198.54.xx.xx) on Tue 24 May 2005 at 21:49
Rather than the "uglified" one-liner being:
mpg123 -s *.mp3 > tempfile
cat tempfile | rawplay
rm tempfile
wouldn't:

mpg123 -s *.mp3 > tempfile
rawplay < tempfile
rm tempfile

have put the point across better, or am I just a pedantic twat?

Posted by forrest (208.42.xx.xx) on Wed 25 May 2005 at 04:17
I agree with Jari Aalto about the "useless use of cat": yes it's an extra process, but jeez, it reads from left to right that way, so it's easier to understand.


Your post makes me wonder, though: do fifos need "cat", or can you say

rawplay < myfifo
? Inquiring minds want to know.

Posted by Anonymous (203.122.xx.xx) on Wed 25 May 2005 at 05:35
Er... try it and see?

Yes,

rawplay < myfifo
will work just fine.

PJ

Posted by Anonymous (150.216.xx.xx) on Thu 2 Jun 2005 at 21:55
You can eliminate the UUoC while maintaining your left-to-right order:

< myfifo rawplay
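
For anyone wanting to compare the three spellings side by side, here is a rough sketch. tr stands in for rawplay and printf for the audio source (both are assumptions for illustration); each form reads the same fifo and should hand the reader the same data.

```shell
workdir=$(mktemp -d)
fifo="$workdir/myfifo"
mkfifo "$fifo"

# Form 1: cat feeding a pipe.
printf 'one\n' > "$fifo" &
with_cat=$(cat "$fifo" | tr a-z A-Z)
wait

# Form 2: input redirection after the command.
printf 'one\n' > "$fifo" &
with_redirect=$(tr a-z A-Z < "$fifo")
wait

# Form 3: input redirection written first, keeping the left-to-right order.
printf 'one\n' > "$fifo" &
redirect_first=$(< "$fifo" tr a-z A-Z)
wait

rm -r "$workdir"
echo "$with_cat $with_redirect $redirect_first"
```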

Posted by Anonymous (213.93.xx.xx) on Sun 5 Jun 2005 at 13:35
< myfifo rawplay

eek!

<outfile> filterprog
</outfile> filterprog

Posted by Anonymous (213.93.xx.xx) on Sun 5 Jun 2005 at 13:36
never mind, forum software ate my post, thanks forum software, i think i'll not repeat the post.

Posted by Anonymous (203.122.xx.xx) on Wed 25 May 2005 at 05:06
> Rather than the "uglified" one-liner being:
> mpg123 -s *.mp3 > tempfile
> cat tempfile | rawplay
> rm tempfile
> wouldn't:
> mpg123 -s *.mp3 > tempfile
> rawplay < tempfile
> rm tempfile
> have put the point across better, or am I just a pedantic twat?

Absolutely.

I wanted the map of the cat to go over well with the other examples in the article.

(confession: I am also perhaps a bit of a lazy physicist [1], so while I am vaguely aware of the "<" operator, just catting things is simpler to my thought processes).

PJ

[1] From the book "Surely You're Joking Mr Feynman!"

The next paper selected for me was by Adrian and Bronk. They
demonstrated that nerve impulses were sharp, single-pulse
phenomena. They had done experiments with cats in which they
measured voltages on nerves.

I began to read the paper. It kept talking about extensors and
flexors, the gastrocnemius muscle, and so on. This and that muscle
were named, but I hadn't the foggiest idea of where they were
located in relation to the nerves or to the cat. So I went to the
librarian in the biology section and asked her if she could find me
a map of the cat.

"A map of the cat, sir?" she asked, horrified. "You mean a
zoological chart!" From then on there were rumors about some dumb
biology graduate student who was looking for a "map of the cat."

When it came time for me to give my talk on the subject, I started
off by drawing an outline of the cat and began to name the various
muscles.

The other students in the class interrupt me: "We know all that!"

"Oh," I say, "you do? Then no wonder I can catch up with you so
fast after you've had four years of biology." They had wasted all
their time memorizing stuff like that, when it could be looked up
in fifteen minutes.

Posted by Anonymous (69.157.xx.xx) on Wed 25 May 2005 at 17:40
i'd say 'pedantic twat' ;)
for a newbie, a left-to-right flow is much more legible, even with the addition of 'cat'.

in any case, AWESOME article!
i can't wait to try this on my home machines.

Posted by Anonymous (194.47.xx.xx) on Wed 25 May 2005 at 06:59
I really like this article! Although I've used UNIX/Linux for a couple of years and really like the command line, I've never really used named pipes. I will now :-)

This article also gave me an introduction to netcat, which seems really useful as well - so I read Steve's article on that subject.

Anyways, this is a very useful site, and I've become a regular visitor.

// Simon

Posted by Anonymous (80.168.xx.xx) on Fri 27 May 2005 at 01:26
To distribute stuff to many machines at once you could use multicast. It's efficient and there's a nice little program for doing it in Debian called emcast.

Perhaps I should write an article about doing it that way.

Anyway, this was a really cool article. Well done.

Posted by Anonymous (203.122.xx.xx) on Fri 27 May 2005 at 03:52
Yup, multicast would be n-times more efficient for n hosts. Maybe that is why the current nettee has a limit of 8 next-hosts hard-coded into it.

I wasn't aware of the emcast app before, thanks. It looks like a good reason to replace netcat/tee here (heh, there's a reason I used the phrase "use and abuse" for the title of the piece). I'll play around with emcast after I've recompiled my kernels to support multicast.

PJ

Posted by Anonymous (69.234.xx.xx) on Mon 18 Jul 2005 at 04:22
Nettee is intended to implement daisy chain
message passing rather than the typical multicast.
In a daisy chain, node i reads data from node i-1,
optionally uses this data locally, and then forwards
it on to i+1. Obviously this is only a plus on
a switched network and even then only if the switch
can handle the traffic. The multiple -next option allows
one machine to split such a daisy chain out
onto two or more subnets through a similar number of adapters. It seemed unlikely that many machines
would have anywhere near 8 network adapters so that
was set as the limit. (Easy enough to change if
one actually does have a machine with 9 adapters.)

The original use for nettee was the distribution of files
to N nodes on a Beowulf. For that one does:

NODE1: datasource | nettee -next NODE2
NODE2: nettee -next NODE3
etc.
NODEN: nettee

Somewhat ironically a good model for picturing this
process is the old thinnet ethernet. nettee is
analogous to the coax tee adapters with the
vertical connector going into the local machine and
the horizontal connectors going to the previous and next
segments of the network. (The parallel breaks down
when multiple -next options are employed since thinnet
could not be split that way.)

The original article did not mention it but nettee is
derived from an earlier program "Dolly" whose home page is
here: http://www.cs.inf.ethz.ch/CoPs/patagonia/#dolly

Regards,

David Mathog

Posted by Anonymous (212.201.xx.xx) on Fri 27 May 2005 at 22:53
Nice article. Here's something I was playing with the other day:

cat music.mp3 | ssh machine "mpg123 -s -" | rawplay

Since my link to machine is fast, and my box is a bit slow, I was thinking of extending this kind of thing to do remote decompression of movies (vlc) or regular files (tar/cpio).

Rabin

Posted by Anonymous (203.122.xx.xx) on Sat 28 May 2005 at 04:30
ssh will have a cpu cycles penalty associated with it due to the encryption it does. netcat does not encrypt (though, interestingly there is a version around that does, called cryptcat - apt-get install cryptcat). It may be worthwhile to benchmark the ssh version vs a netcat version to see how significant the cpu usage difference is.

The netcat version would go something like this:

slowbox:~$ nc -l -p 3456 | rawplay
fastbox:~$ nc -l -p 2345 | mpg123 -s - |  nc slowbox 3456
slowbox:~$ cat music.mp3 | nc fastbox 2345
(You set up the structure by going from the first line to the last line. But to understand it, you look at the data flow, which goes from the last line to the first line)

The ssh version is much more elegant, obviously, being a one-liner.

PJ

Posted by Anonymous (200.57.xx.xx) on Thu 2 Jun 2005 at 23:30
I'm not sure this would work, but I think it would, because gzip does not necessarily wait for all of its input before it starts compressing, does it?

So, before piping to netcat on source, you can |gzip |nc ...etc

On the playing box: nc -l ...|gzip -cd |rawplay... or a script if needed.

Would that compress the stream?
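
A quick way to test the idea without any network at all is to run the two gzip ends back to back. This is only a sketch: the payload is made-up repetitive text, which squashes far better than real audio would, so the byte counts say nothing about what a music stream would achieve.

```shell
# Stand-in for the raw audio stream (hypothetical data, very repetitive).
payload=$(printf 'thump %s\n' 1 2 3 4 5 6 7 8)

# Sender side would be "... | gzip | nc ..."; receiver "nc -l ... | gzip -cd | ...".
restored=$(printf '%s\n' "$payload" | gzip -c | gzip -cd)

sent_bytes=$(printf '%s\n' "$payload" | wc -c | tr -d ' ')
wire_bytes=$(printf '%s\n' "$payload" | gzip -c | wc -c | tr -d ' ')

echo "original: $sent_bytes bytes, gzipped: $wire_bytes bytes"
if [ "$restored" = "$payload" ]; then echo "round trip OK"; fi
```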

Posted by Anonymous (203.122.xx.xx) on Fri 3 Jun 2005 at 09:47
[about putting gzip in the chain]

Try it and see!

I just tried it, and got about 10% compression. gzip is not really designed for compressing audio streams.

The right tool would be flac, which would probably give you 30% or so.

Don't lose sight of the fact that minimal bandwidth would be to pipe the mp3 over of course, then play it. mpg123 will not play a piped mp3 from stdin though, which was one of the reasons pipes are being abused so thoroughly in the article. Else using a dedicated distributed audio player makes more sense.

PJ

Posted by Anonymous (203.122.xx.xx) on Fri 3 Jun 2005 at 11:06
Whoops. Dopey me. I should put in a caveat to what I just said.

I said "mpg123 will not play a piped mp3 from stdin", which is wrong. The "-" option of mpg123 lets me play an mp3 from stdin.

The problem is that if I send a bunch of mp3s in one go with:

source:~$ cat *.mp3 | nc calvin 2345
destination:~$ nc -l -p 2345 | mpg123 -

the destination almost always barfs on the 2nd mp3's header. This happens even with mpg123's "resync-on-broken-headers" option "y" on. So, mpg123 can handle headers once in a stream, but the second time it tends to barf. You can see it at its simplest with:

cat *.mp3 | mpg123 -
ie, it is not a netcat problem.

Hence my cheap 'n nasty solution of using mpg123 to convert to raw audio first, and then pipe the raw audio data around rather than the mp3 data.

Posted by Anonymous (150.216.xx.xx) on Fri 3 Jun 2005 at 18:38
With ZSH, you might do something like the following:

mpg123 -s *.mp3 > >(nc machine1 2345) > >(nc machine2 2345) ...

Given the multiple output-redirections, ZSH will perform the function of tee(1). The other key feature is the >(command) form, which is called process substitution: the shell will run the command between the parentheses and substitute in either /dev/fd/whatever or a named pipe, which FD or pipe will be connected to the command's standard input. You can experiment with stuff like this: echo foo > >(tr o e) > >(tr f p)

BASH supports process substitution, but not teeing on multiple output-redirections.

Posted by johnb (207.114.xx.xx) on Tue 7 Jun 2005 at 20:29
Don't forget to "setopt multios" in your zsh startup scripts, by the way, to those of you Trying this one At Home(tm).

Posted by lindenle (68.77.xx.xx) on Sun 12 Jun 2005 at 17:24
Hi, this article was awesome. I am using the same technique to image a hard drive to a new one. First I booted both of the laptops to live CDs, then I did:

local$ mkfifo ddfifo; dd if=/dev/hda | bzip2 > ddfifo
remote$ nc -l -p 5000 | bzip2 | dd of=/dev/hda
local$ cat ddfifo | nc remote 5000

Now I have my old filesystem perfectly set up on my new laptop....ta da.

Keep these fresh jams coming.

Posted by lindenle (68.77.xx.xx) on Sun 12 Jun 2005 at 17:57
Second bzip2 should be bunzip2 oops...

Posted by Anonymous (92.233.xx.xx) on Thu 19 Mar 2009 at 14:12
It doesn't seem like fifo is really necessary in that case, and neither is cat according to the stuff about "UUoC" above. I haven't tried this approach, but it seems to me like it would solve the problem in two neat lines rather than three:

local $ dd if=/dev/hda | bzip2 | nc remotehost 5000
remote $ nc -lp 5000 | bunzip2 | dd of=/dev/hda

It seems a bit redundant to involve cat when netcat was designed to resemble the original cat as much as possible ;)

Posted by Anonymous (92.233.xx.xx) on Thu 19 Mar 2009 at 14:19
Also if it's your own private LAN and you wouldn't be getting in anyone's way by shifting massive amounts of data, maybe the on the fly compression might be a bit tedious? I don't think I'd do it that way unless I was also saving the output to a backup file as well as sending over the network; from my understanding of the use of tee in this article it seems that I could do that by piping through tee:

local $ dd if=/dev/hda | bzip2 | tee backup.bz2 | nc remotehost 5000

But this seems to be another usage case for that nettee stuff.

Posted by Anonymous (148.87.xx.xx) on Fri 23 Jun 2006 at 22:07
What a cool neat article!! Answered all my piping
questions in a fun easy to learn way. A little bit
of samba elaboration would REALLY solve ALL my problems
(my source is windows. Destination is linux).
Who is the author??

Posted by Anonymous (68.106.xx.xx) on Sun 27 Jan 2008 at 16:41
I never really understood what FIFOs were until I read this article. Thanks for writing it!
