I started my career back in the late seventies, and on IBM mainframes there was either COBOL or assembler. With the 2.5 KB of real memory my first system had (it was already outdated at that time), the choice was quite clear: learn assembler and every dirty trick there was to save some bits here, some cycles there...
Later, in the eighties, I wrote software for DSPs and embedded systems, mainly the DSPs of Analog Devices and the Motorola 56k series. Again, it was assembler to the fullest extent: when you write the internal software for a telephone switchboard and the company wants to sell a few million units, the only thing that counts is: how many cycles did you use for that task? The fewer cycles you needed, the lower the clock rate could be, which means you could buy lower-grade CPUs, which cost exponentially less, which means a BIG difference in customer price. If you could save one arithmetic unit or one register, the custom-built processors could be built without these parts, which made them cheaper too.
We used to beat optimizing compilers by a very large margin. OK, we couldn't decipher our own code three months after "tape-out" (the tasks got simulated on an ICE and finally the tape with the processor specification was sent to the provider of the DSPs to be built to order), but we were expected to write the next version from scratch anyway.
One of my first jobs on a PC was to write a DOS device driver for a specialized file system for the real-time storage of acoustic data. This was the time of MFM disks, and the data rate for the usual 16-bit stereo sampling at 44.1 kHz is just a tad below the bandwidth of the ST-506 interface. My driver had to use every ounce of bandwidth it could get just to keep up. Mind you, we had one of the brand-new, hot 16 MHz NEAT 286s, an extremely fast system!
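To put a number on that data rate, here is a minimal sketch of the arithmetic in C. The function name `pcm_bytes_per_second` is made up for illustration and has nothing to do with the original driver; it only assumes uncompressed PCM with a whole number of bytes per sample.

```c
#include <assert.h>

/* Hypothetical helper (not from the original driver): sustained data rate
 * of uncompressed PCM audio in bytes per second.
 * Assumes bits_per_sample is a multiple of 8. */
unsigned long pcm_bytes_per_second(unsigned long sample_rate_hz,
                                   unsigned int bits_per_sample,
                                   unsigned int channels)
{
    /* Guard the stated assumption: whole bytes per sample only. */
    assert(bits_per_sample % 8u == 0u);

    /* samples per second * bytes per sample * number of channels */
    return sample_rate_hz * (bits_per_sample / 8u) * channels;
}
```

Called with 44100 Hz, 16 bits and 2 channels, this gives 176,400 bytes per second, roughly 1.41 Mbit/s, and the disk has to sustain that continuously, without a gap.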
Sure, these are extremes. But one uses assembly language today for the same reasons it has been used throughout computing history: for its unparalleled speed and for the sheer control it gives you over the system and its hardware. C is a language nicely suited for rapid prototyping, but I'm still quite confident that I can beat any C compiler in terms of speed of execution with hand-crafted assembler code. Even with all those nifty optimizations switched on.
This is not to say anything against C. In fact, I like C. But to really understand what C is about, you have to have experienced the problems solved or avoided by its use. You have to have been at the very bottom at least once to appreciate being on top. And, who knows, you might even start to like being at the source of things.