Databending with Audacity is pretty popular here. It turns out there's also a command-line equivalent of Audacity (called sox) that can be used to process a batch of images or lots of frames in a video. I explain the method here.
These are called byte beats and have been around for a while. You can pipe them on Windows with a program like SoX. (http://sox.sourceforge.net/)
Essentially what is happening is you have a loop counter t (a time step if you will) that is increased for each byte outputted. Different tones are outputted and piped to an external program by manipulating this t variable using math.
Over the years many patterns have been found and you can combine them to produce drum-like sounds, a melody, etc. All of it is done by manipulating a single byte.
It's quite cool.
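The counter-and-formula idea described above can be sketched in a few lines of Python. This is only an illustration: the 8000 Hz rate, the function names, and the particular formula (one widely circulated pattern) are assumptions, not something taken from the comment.

```python
import wave

SAMPLE_RATE = 8000  # assumed rate; bytebeat is commonly played at 8 kHz


def bytebeat(num_samples):
    """Return num_samples unsigned 8-bit samples computed from the counter t."""
    out = bytearray()
    for t in range(num_samples):
        # One well-known pattern; any integer expression of t can go here.
        value = t * ((t >> 12 | t >> 8) & 63 & t >> 4)
        out.append(value & 0xFF)  # keep only the low byte of the result
    return bytes(out)


def write_wav(path, samples):
    """Wrap the raw bytes in a mono 8-bit WAV file so any player can open it."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(1)  # 8-bit WAV data is unsigned, like the raw stream
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(samples)
```

Calling write_wav("bytebeat.wav", bytebeat(SAMPLE_RATE * 10)) would write ten seconds of output. The same bytes could instead be written to stdout and piped into sox -t raw -r 8000 -e unsigned-integer -b 8 -c 1 - out.wav.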
I wrote a program to do this for you. You need Sox installed. Compile it with MLton.
fun |> (x, f) = f x
infix |>

val (input, (output, proc)) =
  case CommandLine.arguments () of
    [rate, inputName, outputName] =>
      ( TextIO.openIn inputName
      , let
          val proc = Unix.execute
            ( "/usr/bin/sox"
            , [ "-b", "16"
              , "-c", "1"
              , "-e", "signed-integer"
              , "-r", rate
              , "-L"
              , "-t", "raw"
              , "-"
              , "-t", "wav"
              , outputName
              ]
            )
        in
          ( Unix.binOutstreamOf proc
          , proc
          )
        end
      )
  | _ =>
      ( TextIO.output
          ( TextIO.stdErr
          , "usage: " ^ CommandLine.name () ^ " <sample rate> <input.csv> <output.wav>\n"
          )
      ; OS.Process.exit OS.Process.failure
      )

fun loop () =
  case TextIO.inputLine input of
    SOME line =>
      ( line
        |> String.fields (fn c => c = #",")
        |> app (fn field =>
             field
             |> Real.fromString
             |> valOf
             |> (fn x => x * Real.fromInt 0x7fff)
             |> Real.toInt IEEEReal.TO_NEAREST
             |> Word.fromInt
             |> (fn x =>
                  [ Word.andb (x, 0wxff)
                  , Word.>> (Word.andb (x, 0wxff00), 0w8)
                  ])
             |> map (Word8.fromLarge o Word.toLarge)
             |> Vector.fromList
             |> (fn x => BinIO.output (output, x))
           )
      ; loop ()
      )
  | NONE =>
      ( BinIO.closeOut output
      ; ignore (Unix.reap proc)
      )

val () = loop ()
We used to do this in the late 80's using sox and SunOS 4.
These days I guess you'd use one copy of VLC acting both as a local player and as a network streamer and have additional copies of either VLC or even standard (win|mac) media player listening to the network stream.
Now if any new release had dynamic range that would justify using 24bit files, that would be great.
Music is compressed to a lifeless pulp nowadays. Compression will bring up the noise floor, which means that fewer bits are needed to perfectly represent the input signal.
I'm talking about 14, maybe 12 bits. Everyone should try this: take a song that doesn't satisfy my old man taste (I'm 27), take SoX, and use it to decimate the bottom 2-6 bits of the track. SoX is smart enough to use dither where appropriate. Try to compare the resulting track to the original; hearing any difference will be very, very hard.
sox music.flac -b 16 temp.flac vol 0.25
sox temp.flac -D 14bit.flac vol 4
This will leave 14 effective bits in the file. Repeat with vol 0.0625 and vol 16 to get a file with 12 bits of dynamic range.
edit: added -D to disable dithering when shifting left
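To see why attenuating and then re-amplifying throws away the bottom bits, here is a minimal Python sketch of the arithmetic on single 16-bit sample values. The function name is made up for illustration, and it uses plain rounding with no dither, so it is simpler than what SoX actually does on the first pass.

```python
def reduce_bits(sample, dropped_bits):
    """Attenuate then re-amplify a 16-bit sample, losing the low bits.

    With dropped_bits=2 this mirrors `vol 0.25` followed by `vol 4`,
    using plain rounding in between (no dither, unlike SoX by default).
    """
    factor = 1 << dropped_bits
    attenuated = round(sample / factor)       # vol 1/factor, stored as an integer
    restored = attenuated * factor            # vol factor
    return max(-32768, min(32767, restored))  # clip to the 16-bit range


samples = [12345, -12345, 7, -7, 1000]
print([reduce_bits(s, 2) for s in samples])   # 14 effective bits
print([reduce_bits(s, 4) for s in samples])   # 12 effective bits
```

After the round trip every unclipped value is a multiple of 4 (or 16), which is another way of saying the bottom 2 (or 4) bits carry no information any more.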
I discovered a voice practice hack. Find a way to echo what you say back to you with a really short delay. You can get instant feedback on your resonance, which is something I have trouble figuring out if I'm doing correctly. You also don't feel as alone when you're doing it, because it's like you're listening to someone else talk.
Install sox and run this:
rec --buffer 512 -c 1 -b 16 -e signed-integer -r 96000 -t raw - |play --buffer 512 -c 1 -b 16 -e signed-integer -r 96000 -t raw -
I'm running this on Linux and it works nicely.
If you have Matlab, you should use it instead of Octave, it's probably a lot faster.
To get the audio file into a format usable by Matlab, use SoX (http://sox.sourceforge.net/). The conversion is very simple: run sox recording.mp3 recording.mat. That's it: you get a struct in Matlab with a wavedata variable that holds all the sample values you need. For example, to plot it, first load the file with y=load('recording.mat'); then run plot(y.wavedata), or if it's stereo, plot one channel with plot(y.wavedata(1,:))
Max is another excellent conversion tool for Mac. Easy to use, supports loads of formats, etc. I have both on my laptop. Though if you want more in-depth things like sample rate conversion, Sound eXchange (SoX) might be your best bet.
Sites:
SoX - but it's command-line only so you'd have to get used to that. It is incredibly powerful for processing audio files though.
I did find a decent tutorial on how to install and use SoX to normalise a folder of WAV files:
You can use sox to bulk normalize the audio:
for audio in *.wav; do
  sox --norm=0 "$audio" "${audio%.wav}-processed.wav"
done
On the off chance that you might want to port your slot machine to a standalone embedded device, or simply have access to all the source from top to bottom rather than just an OS API, or perhaps see how to add in programmable effects, see the SoX library. It's robustly portable, is probably older than you are, and has a host of features.
The Scientist and Engineer's Guide to Digital Signal Processing - free online book.
Sox - the Swiss Army knife of sound processing programs.
As a digitised signal continuously arrives, a program can apply a digital filter to it that could extract a spectrum, modify the bass mix, apply an echo, etc. The modified signal can be continuously emitted with a small, barely perceptible delay.
Audacity is a tool that should allow you to study the effects of digital filters and probably let you design your own.
Sox, linked above, is probably one of the simplest frameworks to make your own pass through filters for audio, and thus have your own "real time" effects for voice or music.
With sox (http://sox.sourceforge.net/), to join any number of audio files, you can do:
sox *.mp3 *.wav output.mp3
With Festival (https://wiki.archlinux.org/title/Festival) you can do command-line text-to-speech (text2wave -o output.wav file_to_read.txt), so you could make it speak the filename, then output the audio file, and join them all into one file, with a bash script very similar to the one using "cat" that others suggested.
Not ffmpeg but sox - maybe put this in batch in folder containing only files to be joined:
sox "*.wav" "long.wav"
Joins in order if you name files 1.wav, 2.wav, 3.wav etc. If you want a file multiple times, just duplicate and number it.
Using sox or something comparable, you should be able to dump the actual amplitude values for each input file. If the difference between max and min amplitude in the output is close enough to zero, then delete the file.
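As a rough sketch of that check without SoX, Python's standard library can read the sample values directly. This assumes 16-bit PCM WAV input; the function names and the threshold of 8 are illustrative choices, not values from the comment.

```python
import array
import sys
import wave


def peak_to_peak(path):
    """Return max minus min sample value of a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0
    return max(samples) - min(samples)


def is_effectively_silent(path, threshold=8):
    """True when the amplitude range is close enough to zero (assumed threshold)."""
    return peak_to_peak(path) <= threshold
```

A cleanup script would loop over the files, call is_effectively_silent on each, and delete (or better, first just list) the ones that return True.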
Beware of converting audio files from one compressed format to a different one. I can't remember where, but I was told (or read) that you can lose a fair amount of quality doing this. It's better to go from uncompressed to the compressed format that you want. Of course, if you don't have access to the uncompressed audio you have no choice. There is a command line utility for doing such conversions: SoX.
Audacity can fix that, but it is GUI driven. Sox (SOund eXchange) is the program that you are looking for.
If by uneven audio you mean that the music is too loud for certain parts of the song and too quiet at other points, then you need to Compress them a little. Look at the "compand" function.
If by uneven audio you mean that some songs were just recorded or ripped too quietly, then you want to Normalize the audio. Look at either the "--norm" option or the "gain -n" command.
I've never done it before, but I'm pretty sure I heard a colleague say he used it to either strip or add silence to a bunch of sfx.
I'll ask him how he did it later today and get back to you.
Edit: So he used SoX to recursively search through a directory tree and strip silence off of the end of many audio files. He's trying to remember how he did it, and we're gonna meet later today so he can show me.
In the meanwhile, you should check out documentation for the SoX editing effects: http://sox.sourceforge.net/sox.html
Take a look at the pad, silence, splice, and trim functions.
Hopefully I'll have some more info in a few hours, and maybe even a sample script...
Edit2: Okay, so I have a sample batch script (http://pastebin.com/AeNLtPh3), but it's not the one he used to chop up audio. It looks like it contains stuff that you probably already know how to do, so I don't know how helpful it will be. The script recursively applies normalization & compression to wavs in a directory structure and converts them to ogg. You can probably use this as a skeleton for your script. Replace the sox commands with the ones that you want. I don't know exactly what you're up to, but I think you'll want to use trim. For example, sox input.wav output.wav trim 0 10 will copy the first 10 seconds of input.wav into output.wav. Adding on a norm -3 will normalize the output file to -3dB. You can use rec test.wav trim 0 0:10 to record the first ten seconds of your default audio device to a file called test.wav. I hope this helps.
SOund eXchange or Sox is command line based. It is the Swiss Army knife of sound processing programs. It has a compressor, and it can play and record audio, too.
I am just riffing off the cuff, so anyone step in if my idea doesn't make any sense.
As long as you don't need to merge the audio back with the video files, I can sort of see how you could do this with SOX and a scripting language [Python is my choice].
Just export the data in the spreadsheet to something you can read into Python and then have the Python script build a SOX command[s] for each video.
The Python part is more or less the easy part. Making the SOX command is a little trickier. SOX is a great tool to keep around and its commands are not hard, they are just not always intuitive, especially if you're doing something complex. [From my experience it is best to find the sequence of commands to do and chain them together.]
But once you get a basic SOX command[s] for what you need to do for each video the rest should be cake.
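A minimal sketch of that idea, assuming the spreadsheet is exported as CSV. The column names (input, output, start, duration) and the use of the trim effect are hypothetical stand-ins, since the real spreadsheet and the real SoX operation aren't described here.

```python
import csv
import io
import shlex


def build_commands(csv_text):
    """Turn rows of (input, output, start, duration) into SoX argument lists."""
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        commands.append([
            "sox", row["input"], row["output"],
            "trim", row["start"], row["duration"],
        ])
    return commands


# Hypothetical export; the real spreadsheet's columns are unknown.
EXAMPLE_CSV = """input,output,start,duration
clip one.wav,clip one-cut.wav,0,10
clip2.wav,clip2-cut.wav,5,2.5
"""

for cmd in build_commands(EXAMPLE_CSV):
    # Printing is a dry run; subprocess.run(cmd, check=True) would execute it.
    print(shlex.join(cmd))
```

Keeping each command as a list rather than one string means file names with spaces need no manual quoting when the list is handed to subprocess.run.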
I don't know how to do it on Windows, but here are instructions for Linux.
I obtained the audio by configuring ALSA to write all audio to a file:
pcm.!default {
    type file
    slave.pcm hw:0
    file alsa_output.wav
    format wav
}
Then I used SoX (which does work on windows, by the way) to generate the spectrogram:
sox alsa_output.wav -n spectrogram
To generate the spectrogram for the notched audio, we can use the bandreject effect before spectrogram:
sox alsa_output.wav -n bandreject 15.625k 15q spectrogram
I recommend SoX for audio processing. Since it's command-line, it's pretty easy to come up with a script, say, for all files within a folder.
Also their resampler is REALLY good and highly configurable
K, I went ahead and programmed it up. Here is the download. Wanna test it out and tell me how it goes? I tested it by replacing Menu_Open.xnb and it works in game. I'm not sure how thorough or resilient the code is, so if you run into any problems, let me know. Also, make sure the WAV files are PCM. For good measure, I always make them 2 channel, 44100 hz, 16 bits PCM wav files. If you have any trouble with converting audio, I use this: http://sox.sourceforge.net/
There's always sox http://sox.sourceforge.net/
The command would be
sox infile96k.wav outfile44k.wav rate 44100
With a unix terminal (using Linux, Mac ports, or Cygwin on windows), to convert all the 650 samples, utilizing all your CPU cores (using the gnu parallel app) you'd run
mkdir /home/steffeh/samplesdir/44k
find /home/steffeh/samplesdir/96k -iname "*.wav" | parallel sox {} /home/steffeh/samplesdir/44k/{/} rate 44100
( {} is replaced with the filename, {/} is replaced with the filename without the directory path )
Other software for spectral analysis:
Adobe Audition (Windows or Mac OS)
Audacity (Windows, Mac OS, Linux)
SoX (Windows, Mac OS, Linux — command line only)
>I've tried using iTunes' internal file converter to make Apple Lossless versions of the files, but they remain with extremely high Bit Rates (2000+), and at 24Bit depth.
I downloaded the Ecozoic album that you linked to and converted it to ALAC and iTunes shows the bitrate as ~1,700kbps (same as FLAC). Lossless compression does not change bit depth.
>Is this just BandCamp's conversion system making a mistake, and not actually compressing the files, but still putting a lossless file container/extension on them?
There's no mistake. The files are definitely compressed. The size of the Ecozoic album downloaded from Bandcamp as FLAC – excluding tags – is 1.62 GB (1,748,996,215 bytes) and the size of that same album as uncompressed WAV is 2.14 GB (2,307,791,926 bytes).
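The two byte counts quoted above are enough to work out how much the FLAC encoding actually saved. A quick check of the arithmetic (variable names are just for illustration):

```python
flac_bytes = 1_748_996_215  # FLAC download, excluding tags
wav_bytes = 2_307_791_926   # same album as uncompressed WAV

ratio = flac_bytes / wav_bytes
print(f"FLAC is {ratio:.1%} of the WAV size")
print(f"space saved: {1 - ratio:.1%}")
```

So the FLAC files come out at roughly three quarters of the uncompressed size. High-resolution material often compresses less than that, which is why the bitrate still looks high.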
>In order to achieve a uniform collection of music which has an uncompressed sample rate and depth of 44.1Khz 16Bit, and similar file sizes... would a 2 step process of using iTunes file conversion to create a 44.1Khz 16Bit file - then converting to Apple Lossless - be adequate?
Sure, that should work out fine.
>Is there a better/best way to take a 48Khz or 44.1Khz 24Bit file, and convert it to a 44.1Khz 16Bit file... without using a DAW and/or specialized dithering software?
You might like to try out SoX (Sound Exchange) as it's known for high quality resampling and bit depth conversion. SoX Wrap and QSox are a couple of GUI frontends for SoX if you want to avoid the command line (although the command line offers the most options).
SoX has the sinc effect that you can set up as a very steep highpass, lowpass, bandpass, or band-reject filter.
It also has the spectrogram feature that plots an amplitude vs. frequency vs. time graph.
Interesting. A file like that would help with splitting the audio file with sox. If one didn't want to split the files, you could silence them instead (also with sox).
Sorry for the late reply. This is all the juice I managed to get out of it so far. As you can see, it's importable, but terribly clipped and unusable.
Do you have any files from the same set of recordings that work fine? If not, then at least a couple of files that don't work similarly to this one would help a lot. I'm pretty sure the problem's with the corrupted header, not the audio blocks themselves. Looking at what's similar and what's different between several wavs will be helpful. Also, do you have any information on whether the recordings were stereo or mono? Information about what software and settings were used to digitize those files in the first place will be immensely helpful as well. IMA ADPCM is a very rare format, and is identical to several others. The only hint I have that this is in IMA ADPCM is one byte in the wave, which could be one of the things that are corrupted.
Anyways, if you still want to convert all the other files like this one (importable but horribly clipped), you can do that with SoX. After installing, start cmd, go to the SoX installation path and drag and drop sox.exe to the command line, add " -t ima -r 44100 " (without the quotes), then drag and drop the file you want to enter, add "C:/file.wav" at the end (WITH quotes this time, and of course the space at the beginning) and hit enter. This should create a file.wav on C:/ that should be importable in a DAW. An example of the correct string in the command line is > "C:\Program Files (x86)\sox-14-4-1\sox.exe" -t ima -r 44100 "C:\Program Files (x86)\sox-14-4-1\Nights Alone - piano.wav" "C:/file.wav"
You could create a batch version of this pretty easily, if you know batch programming.
What OS are you running? It would probably be pretty easy to do with SoX and an sh script or a batch file. I don't think SoX will decode ALAC though, so you may have to convert the tracks to FLAC or WAV first...
The "speech" ROMs have CVSD encoded audio onboard. It's a matter of using the right tooling in order to dump the raw audio frames to an audio manipulation suite. I use ye olde sox.
Maybe look into SoX, a great command line tool for audio manipulation. I use it mainly to transcode audio from one format to another, but it can also combine files and apply effects. From the manpage:
> SoX reads and writes audio files in most popular formats and can optionally apply effects to them. It can combine multiple input sources, synthesise audio, and, on many systems, act as a general purpose audio player or a multi-track audio recorder. It also has limited ability to split the input into multiple output files.
Ah, well if you like that area then here's a 'challenge' . . .
Get SoX - it's a command line cross platform audio manipulation package.
From the command line it will convert audio from one format to another and optionally apply filters / change sampling rates. One of the formats it accepts is RAW - this is essentially what you get straight out of an A/D converter.
What you might like to try is to build SoX with debugging info and then trace it as it processes blocks of piped in RAW data and does stuff (apply filter / resample / etc).
One essential library / command line toolset to have is Sox or an equivalent.
It's open source and demonstrates capture, format conversion, basic effects generation, and simple filters.
Ok, if you already have the audio files, and they're prerecorded in such a way where you can mix and match them to generate a new audio file, then what you most likely want is a script that runs a series of SoX Sound Exchange commands. You can write the script with Python if that is what you're learning, but you might want to familiarize yourself with how terminal applications such as SoX work in unix style command lines before you attempt it. Once you're comfortable enough using SoX to cut / paste the audio clips together into a new audio file using the terminal, then you can start to automate that process with a script. This should get you started.
https://www.davidbaumgold.com/tutorials/command-line/
https://bham-carpentries.github.io/2018-12-17-bham_python-novice-inflammation/05-scripts/index.html
At OP, what jumpfetus said is very much right. I figured I'd mention a program that can do exactly what he said, i.e. Sox http://sox.sourceforge.net/. It can be a bit tricky to use, I think, but give it a go.
Ok, I found exactly what I was looking for.
rec -r 44100 -b 16 -e signed-integer -p \
silence 1 0.50 0.1% 1 10:00 0.1% | \
sox -p song.ogg silence 1 0.50 0.1% 1 2.0 0.1% : \
newfile : restart
Did a bit of digging on Github and the author confirms
`Only raw bin files are supported as tracks in cuesheets.`
That means you have to convert wav to bin for Duckstation :(
The only way I know how to do that off of the top of my head is with the command-line only tool SoX.
sox.exe "input.wav" -t raw -e signed-integer -c 2 -b 16 -r 44100 "output.bin"
It's not easy. You will need underlying software that can do the conversion. So use your knowledge of html and js to build a website where users can upload media. Then you need to take that uploaded media, process it through some back end software, and then return the result to the user. Or host the result and provide a link to the end user. For example if you want to convert audio to a different format you need to process it on the back end with something like SoX - http://sox.sourceforge.net
I'm not an expert, but I suspect there is a way to get what you want using the sox utility if you can handle using a command line.
It can detect and report many statistics about audio files, for instance max amplitude. If there was a characteristic that made the corrupt files stand out, you could have it process all files and filter the output for that signature to get a list of the ones you're looking for.
Probably would take some research and time but I think it might be possible. If you Google for "Sox detect corrupt audio" there are some results that aren't too far off what you'd need.
You may be able to play with the copy command to make it work as well. Which probably isn't the best solution. Sorry I don't do this type of thing. Command is: copy /b file1.mp3 + file2.mp3 + file3.mp3 finalfile.mp3
You could probably play around with a batch script to sequence the files in a folder and join them in the proper order using that command
The other options would be to use a utility such as Audacity or SOX. I don't use these tools often so you are on your own in using google to figure it out.
I found an answer (really, right after I posted).
Sox is a command line utility that does the job pretty well. Still have to type each file name, but it can then grab all six channels and merge/convert into a multi-channel wav file.
For posterity: I did some further googling and found SoX / Sound eXchange http://sox.sourceforge.net/
It can do a ton of effects and filtering from the command line, which fits my use case perfectly.
IMHO, the simplest way is to use SoX. As long as the files have the same sampling rate and number of channels, it's like this:
$ sox file1.flac file2.flac file3.flac merged_files.flac
In my view, if you do any audio file manipulation, SoX should be in your tool box.
I usually use SoX for this. However I can appreciate that this is a pretty technical piece of software. There might be a graphical front-end for batch conversion, but I don't know of one off the top of my head.
Under Linux or OSX I would do this:
cd where_the_wav_files_are
mkdir out
for a in *.wav ; do sox "$a" -r 22050 -b 16 "out/$a" ; done
This will create a directory called 'out'. All .wav files in the other directory will be converted to 22Khz, 16-bit signed .WAV files and placed in the 'out' directory.
For windows using the cmd.exe as the shell, I think it'd probably be:
cd where_the_wav_files_are
mkdir out
for %a in (*.wav) do sox "%a" -r 22050 -b 16 "out\%a"
...all of these examples assume that SoX has been installed on the system path so it can be run from any location. If you don't want to do that because this is a one-off, you could copy the .WAV files you want to the directory where the SoX executable has been placed and run it from there.
If you're using PowerShell - good luck.
This might be more of a SoX thing.
You could do each file individually in Audacity using Amplify and then Limiter. But there would be... finesse to all of that. You would have to sort of look at each individual file and consider its peaks and valleys, then select a level you want the Limiter to be set to.
In general, older recordings have a greater dynamic range. Everything in modern recordings is compressed and maximized to a bewildering extent. It all just sounds LOUDER even though it technically isn't. Look up "Loudness Wars" for a description of this.
As an alternative approach, your car audio system or phone may very well have an automatic leveling feature that recognizes LOUD modern mixes and "quiet" older mixes and adjusts the volume automatically.
Huh. In terms of uncompressed formats, I use AIFF instead of WAV because it offers a lot more in terms of metadata. So for losslessly compressed, I'd use ALAC. It really depends on what you're using them all for I guess.
For batch processing, you can in theory use Audacity but I don't love that and it doesn't work on Catalina, the current macOS version. I own Compressor from Apple, which does a great job but I think costs a one-time $50.
If you're comfortable with command-line tools, something like SoX could work well for you, it seems super powerful for batch processing but is predictably obscure. It even supports m3u playlists. Note, however, that it doesn't support ALAC!
You could certainly use PowerShell to script some utility that either actually normalizes the audio or adds "replaygain" info.
sox is a handy command-line general-purpose program for manipulating audio, but it might not be the most appropriate for the format(s) that you're working with.
With very high quality interpolation, there is basically no audible difference. If you are interested in that topic you might want to look up this: http://sox.sourceforge.net/SoX/Resampling
Sox is THE library for audio, and it's well tested.
Thanks for the feedback. Lot to unpack here.
> Man I wish people didn't input 2d images into networks.
Why do you say this? 2D locality is literally what makes convolutional networks so effective.
> How about just 1d slice, say FFT of 1024 points, yielding magnitude spectrum of 512 values or something?
This is a very basic approach, and does not work as well as you think.
Pitch perception is an inherently human thing, and relies on lots of factors, including physiology (like the shape of the inner ear, your brain structure, etc.) and time-scale. Some instruments take longer to ease into pitches, and humans need processing time to recognize them. When you start trying to recognize chords, you have even more harmonic complexity that is very difficult to analyze with just FFTs. (A lot of this complexity has been researched heavily, and is well understood.)
This is why spectrograms work so well -- the time dimension allows convnets to detect both time- and frequency- local patterns.
Note that for Pitchy Ninja, I actually use a more traditional pitch estimation approach (it supports autocorrelation, YIN, and AMDF), obviously not using convolutional networks. It works well in most cases, but tends to mis-predict harmonically rich instruments.
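For readers who haven't seen the autocorrelation approach mentioned above, here is a bare-bones Python sketch. It is not Pitchy Ninja's code; the function name, the lag search range, and the test tone are all assumptions chosen for illustration.

```python
import math


def estimate_pitch(samples, sample_rate, min_hz=50.0, max_hz=1000.0):
    """Estimate the fundamental frequency from the autocorrelation peak."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(a * b for a, b in zip(samples, samples[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


SAMPLE_RATE = 8000
# A pure 200 Hz sine, so one period is exactly 40 samples.
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(1024)]
print(estimate_pitch(tone, SAMPLE_RATE))
```

On a clean sine this lands on the right period. With harmonically rich signals, a lag that matches a strong overtone or a multiple of the period can score higher, which is the kind of mis-prediction described above.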
> aliasing-free resampling is not that difficult
:-) "not that difficult"
Sox has pretty good resampling: http://sox.sourceforge.net/SoX/Resampling
If you have any scripting experience SoX has a spectrogram generator. It wouldn't be hard to pull the eps from the rss feed, generate a spectrogram for each one and then look for the one that has a gap in it. I'd do it, but I'm spending my MLK holiday painting my stairwell and hallway. Ugh painting sucks
On un*x systems you can install sox
$ soxi -B audio.wav
128k
Quick --help:
Usage: soxi [-V[level]] [-T] [-t|-r|-c|-s|-d|-D|-b|-B|-e|-a] infile1 ...

-V[n]   Increment or set verbosity level (default is 2)
-T      With -s, -d or -D, display the total across all given files

-t      Show detected file-type
-r      Show sample-rate
-c      Show number of channels
-s      Show number of samples (0 if unavailable)
-d      Show duration in hours, minutes and seconds (0 if unavailable)
-D      Show duration in seconds (0 if unavailable)
-b      Show number of bits per sample (0 if not applicable)
-B      Show the bitrate averaged over the whole file (0 if unavailable)
-e      Show the name of the audio encoding
-a      Show file comments (annotations) if available
> I just need software that can add an adjustable amount of latency (from 100ms to at least 5000ms) because I can figure the rest out if need be.
As in a utility that you can (at command line level) pipe streamed audio channels into and out of with one delayed?
SoX has been doing this since 1991
http://sox.sourceforge.net/sox.html
sox <stdin format args> - <stdout format args> - delay <delay args>
is the invocation to accept raw stream data (described by the format args) on STDIN and pass it through to STDOUT subject to the delay arguments (for individual channels).
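Conceptually, delaying a raw PCM stream just means emitting the right amount of silence before the first real sample. A small Python sketch of that calculation, with made-up names and assuming signed PCM (where silence is all zero bytes):

```python
def delay_pcm(raw, delay_ms, sample_rate, channels=1, bytes_per_sample=2):
    """Delay raw signed PCM by prepending the matching amount of silence."""
    frames = round(sample_rate * delay_ms / 1000)
    silence = b"\x00" * (frames * channels * bytes_per_sample)
    return silence + raw


chunk = b"\x01\x00" * 4  # four 16-bit mono samples
delayed = delay_pcm(chunk, delay_ms=100, sample_rate=8000)
print(len(delayed))  # 800 frames * 2 bytes of silence + 8 bytes of audio = 1608
```

For the 100 ms to 5000 ms range asked about, only delay_ms changes; at 44100 Hz stereo 16-bit, 5000 ms works out to 882000 bytes of leading silence.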
You're making us work here. My interpretation of the image is that you have a USB-C to 3.5mm audio output lead so that when you plug it in to your phone the speaker output goes along that cable. (The cable you show looks like a left+right+video cable which may not work). Then you're combining left and right with the little RCA to 3.5mm TRS adapter.
Mysteriously you show a raspberry pi with a USB microphone plugged in to it. I think Raspberry Pi only has a 3.5mm audio output socket on them.
So my answer is no.
My advice:
Note that you'll have two USB audio input devices on the Pi and I'm not sure if programs like Audacity can handle two different devices. You might be able to do something with sox or similar.
I do this regularly as I have an MPC1000 too, you can use a lot of tools to do this :
Command line tools such as SoX work great too. Example of a command :
sox sample.wav -b 16 mpc1000_ready_sample.wav rate 44100
Here sample.wav is your original sample and mpc1000_ready_sample.wav is the converted file that will be created.
This is a 12-channel .wav file. Many tools can't play this file. Dropbox can't preview it, but I can download the file, then SoX can play it. SoX is a set of command-line tools for processing sound.
The command soxi BGM_NetworkMainScene.wav
identifies the file:
Input File     : 'BGM_NetworkMainScene.wav'
Channels       : 12
Sample Rate    : 32000
Precision      : 16-bit
Duration       : 00:00:33.16 = 1061225 samples ~ 2487.25 CDDA sectors
File Size      : 25.5M
Bit Rate       : 6.14M
Sample Encoding: 16-bit Signed Integer PCM
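The numbers in that report are all related, and it can help to see how. The following Python lines recompute them from the channel count, sample rate, precision and sample count; only the variable names are invented.

```python
channels = 12
sample_rate = 32000            # Hz
bits_per_sample = 16
samples_per_channel = 1061225  # the "samples" figure in the Duration line

bit_rate = channels * sample_rate * bits_per_sample
duration = samples_per_channel / sample_rate
cdda_sectors = duration * 75   # CD audio has 75 sectors per second
data_bytes = samples_per_channel * channels * bits_per_sample // 8

print(bit_rate)                # 6144000, shown by soxi as 6.14M
print(round(duration, 2))      # 33.16
print(round(cdda_sectors, 2))  # 2487.25
print(data_bytes)              # 25469400, about 25.5M
```

The high bit rate is simply what twelve uncompressed 16-bit channels add up to.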
These commands play the file:
play BGM_NetworkMainScene.wav
combines all the channels.
play BGM_NetworkMainScene.wav remix 1 2
plays only channels 1 and 2. (I have a 2-channel stereo speaker, so I pick a left channel and a right channel.)
play BGM_NetworkMainScene.wav remix 9,11 10,12
combines channels 9 and 10 with 11 and 12.
play BGM_NetworkMainScene.wav remix 5,7 6,8 trim 0 14.2 repeat -
loops the first 14.2 seconds of channels 5 to 8.
You might try looking into SoX: http://sox.sourceforge.net/Docs/Features
If this is something that looks like it could work, then here is a Go interface: https://github.com/krig/go-sox
Also check out: https://github.com/avelino/awesome-go#audio-and-music
SoX has some of the best SRC on the market (many products license it for their use) and it's a command line utility that can easily do batch processing.
Looking at your source code now, I see! Yeah it is kind of funny that you went to the trouble of making the waveform in Bash but wrote a C program to mix it.
Another strategy you could use: output MIDI to Timidity.
If you're going to go the route of using native code for mixing, I recommend you check out Sox (you can probably install it using your package manager), it has command line options for both generating wave forms and for mixing. If I were going to write a sequencer in Bash, it's the method I'd use.
To mix waves together, you need to sum them together into an array of integers before dumping it to the WAV stream. So declare an array and fill it with samples (although this might be a bit slow in Bash):
declare -a BUFFER=();

sum_to_buffer() {
    let FREQ="${1}"; shift;
    let PW="${SAMPLE_RATE} / ${FREQ}";
    let HALF_PW="${PW} / 2";

    # This generates a sawtooth wave, rather than a square wave.
    for (( i = 0; i < $HALF_PW; i++ )); do
        let BUFFER[$i]="((${i} * 255) / ${PW}) + BUFFER[$i]";
    done;
    for (( i = $HALF_PW; i < $PW; i++ )); do
        let BUFFER[$i]="((${i} * 255) / ${PW}) + BUFFER[$i]";
    done;
}

buffer_as_Cstring() {
    # First format a string containing the content of the buffer as a C-string.
    # The doubled backslash emits a literal \xNN for dump_buffer to decode.
    for i in "${BUFFER[@]}"; do
        printf '\\x%02X' "${i}";
    done;
}

dump_buffer() {
    printf "$(buffer_as_Cstring)";
}
I use SoX on command line for converting audio. It's easy-to-use and very convenient.
Sometimes I prefer a GUI; then I use Foobar2000 (which in itself is a slick, but strongly extensible audio player for Windows.) as my GUI for audio transcoding purposes.
SoX works quite well with everything I've thrown at it (including the 8SVX and raw 8 bit samples most Amiga software use). It's a great way to batch convert some samples to something Amiga compatible (and back)
What do you mean by
> intend to attempt to replicate an audio waveform using the Fast Fourier Transform
Do you want the maximum amplitude of the whole signal, the signal in the frequency domain or something else?
In case you want the amplitude of the whole signal, Sox may help. In command line:
sox filename.mp3 -n stats
or for non-dB scaling:
sox filename.mp3 -n stat
A hardware compressor is the best way to go.
Yes, there will be latency with the Pi, but how much? If it's under a fraction of a second, will it be noticeable?
If you wanted to try, you would first need a sound-card with both a DAC and ADC. Most of the cheap USB sound-cards only have a mono ADC for a single microphone.
Take a look at either the Behringer U-Control UCA222 USB, or the Audio Injector HAT. I have the Behringer. It's very good. I don't know much about the Audio Injector.
The second step would be to pipe the audio in stream through a program into the audio out. Take a look at SOX.
http://sox.sourceforge.net/sox.html
https://linux.die.net/man/1/sox
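To get a feel for the software route, here is a rough Python sketch of what "a program in the middle" would do to each block of audio, plus the delay that buffering a block adds. Everything here is an assumption for illustration: the names compress_block and block_latency_ms are made up, the compressor is a bare static one (no attack or release smoothing), and real latency on a Pi will also include driver and USB overhead that this does not model.

```python
def compress_block(samples, threshold=0.5, ratio=4.0):
    """Apply a static compressor to float samples in the range [-1, 1]."""
    out = []
    for x in samples:
        level = abs(x)
        if level > threshold:
            # Above the threshold, the excess is divided by the ratio.
            level = threshold + (level - threshold) / ratio
        out.append(level if x >= 0 else -level)
    return out

def block_latency_ms(frames_per_block, sample_rate):
    """Minimum delay, in milliseconds, from buffering one block before processing."""
    return 1000.0 * frames_per_block / sample_rate

# Example: a 1024-frame buffer at 44100 Hz adds roughly 23 ms on its own.
delay = block_latency_ms(1024, 44100)
```

Smaller blocks mean less delay but more work per second for the Pi, which is the trade-off you would be tuning.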
I've never used it on Windows, but it looks like there's an installer: http://sox.sourceforge.net/
Under Linux it's fairly simple to use shell scripting to do what you need; on Windows you can use a .bat file to automate it, like in this example:
http://sox.cvs.sourceforge.net/viewvc/sox/sox/scripts/batch-example.bat?view=markup
HTH
Do you want to put the wav file as-is into progmem, or convert it to a stream of samples first? In the first case, see here.
If you just want to put the samples into progmem, SoX can convert wav files (and other audio formats) to C files containing an array of the data, which you can include in (or paste into) your Arduino code. Just preface the resulting array with PROGMEM to tell the compiler to put it there.
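If you end up generating the array yourself, a small Python sketch like the following shows the shape of the result. The function name samples_to_progmem, the array name sound_data and the choice of unsigned 8-bit samples are all assumptions for illustration; adjust the C type if your playback code expects something else.

```python
def samples_to_progmem(samples, name="sound_data", per_line=12):
    """Format unsigned 8-bit samples as a C array declaration for AVR PROGMEM."""
    if any(not 0 <= s <= 255 for s in samples):
        raise ValueError("samples must be unsigned 8-bit values")
    lines = []
    for i in range(0, len(samples), per_line):
        chunk = samples[i:i + per_line]
        lines.append("    " + ", ".join(f"0x{s:02X}" for s in chunk))
    body = ",\n".join(lines)
    return (
        "#include <avr/pgmspace.h>\n\n"
        f"const uint8_t {name}[{len(samples)}] PROGMEM = {{\n{body}\n}};\n"
    )

# Example: print a three-sample array ready to paste into a sketch.
print(samples_to_progmem([0, 128, 255], name="beep"))
```

On the Arduino side the data then has to be read back with pgm_read_byte rather than indexed directly.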
You may be interested in learning about the "Sox" tool. It is a command line tool (UNIX, Linux, MacOS, Cygwin) that lets you manipulate sound data, and apply filters and effects: http://sox.sourceforge.net/
It is open source, so you can look at the source code to see how things work! https://sourceforge.net/p/sox/code/ci/master/tree/
Well, your comment is not very descriptive, so here are some things somebody can miss.
Use extract on soundbank files, i.e. the files in directories inside the sounds folder. I didn't test it on other resources in game directories, but I doubt there are more sounds in any other game files.
Then run the reformat program on the extracted files.
Don't run the reformat program on sound files more than once. It will change the files again and make them wrong. For example, here are spectrograms of a sound file reformatted normally and what it decodes like if you reformatted it one extra time.
Also, if there are some files which sound fine right after extracting from a soundbank but before using the reformat program, and get messed up by it, I would like to know.
https://www.dropbox.com/s/ip2f4ylw62wxbo6/python_verification_scripts.zip?dl=0
Both scripts assume you preserved the directory structure from flac to mp3. So if you have a flac at $FLAC_ROOT/ph1995/ph1995-11-16.akg-c1000.flacf/ph1995-11-16d1t01.flac, the scripts will look for a file at $MP3_ROOT/ph1995/ph1995-11-16.akg-c1000.flacf/ph1995-11-16d1t01.mp3
You'll need to change the paths in the scripts to point to the top level directory where your files are. Each script has one line with the FLAC path, and one line with both the FLAC path and the mp3 path. Neither script is very long, so this should be simple.
findMissing.py will print out any files that don't have a match.
compareLengths.py will print out any files whose FLAC and MP3 lengths differ by more than 2 seconds. This script requires SoX (http://sox.sourceforge.net/), a separate utility that it uses to measure the lengths of the files. You can install it with Homebrew on a Mac. I looked pretty hard for something built in to Python, but I couldn't find anything, so this was the next best option.
I'm happy to answer any questions.
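For anyone who would rather not download the zip, the idea behind the two scripts can be sketched roughly like this. This is not the code from the archive; the function names are made up, and duration_seconds assumes SoX's soxi tool is on your PATH and that your SoX build can read mp3.

```python
import os
import subprocess

def expected_mp3_path(flac_path, flac_root, mp3_root):
    """Map a FLAC path to where its MP3 should live in the mirrored tree."""
    rel = os.path.relpath(flac_path, flac_root)
    return os.path.join(mp3_root, os.path.splitext(rel)[0] + ".mp3")

def find_missing(flac_root, mp3_root):
    """Return every FLAC file that has no matching MP3."""
    missing = []
    for dirpath, _dirs, files in os.walk(flac_root):
        for name in sorted(files):
            if name.lower().endswith(".flac"):
                flac = os.path.join(dirpath, name)
                if not os.path.isfile(expected_mp3_path(flac, flac_root, mp3_root)):
                    missing.append(flac)
    return missing

def duration_seconds(path):
    # Assumption: soxi is installed; "-D" prints the duration in seconds.
    return float(subprocess.check_output(["soxi", "-D", path], text=True))

def lengths_differ(flac_seconds, mp3_seconds, tolerance=2.0):
    """True when the two durations are more than tolerance seconds apart."""
    return abs(flac_seconds - mp3_seconds) > tolerance
```

Only the file extension of the track is swapped, so a directory name like ph1995-11-16.akg-c1000.flacf is left alone.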
You might find SoX really useful for this; it is a command-line utility that can do all sorts of legwork on audio files. You can strip silence based on a duration and a threshold.
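In SoX this is the job of the silence effect. To show what "duration and threshold" means in practice, here is a minimal Python sketch of the same idea on a plain list of samples. The name trim_silence and its parameters are assumptions of mine, and the logic is a simplification, not SoX's actual algorithm.

```python
def trim_silence(samples, threshold, min_sound_frames=1):
    """Drop leading and trailing samples quieter than threshold.

    Sound starts at the first run of at least min_sound_frames consecutive
    samples whose magnitude exceeds threshold (the "duration" part).
    """
    def first_loud(seq):
        run = 0
        for i, s in enumerate(seq):
            run = run + 1 if abs(s) > threshold else 0
            if run >= min_sound_frames:
                return i - run + 1  # index where the loud run began
        return None

    start = first_loud(samples)
    if start is None:
        return []  # nothing above the threshold at all
    end_offset = first_loud(samples[::-1])  # same search from the other end
    return samples[start:len(samples) - end_offset]

# Example: short clicks are ignored when two loud frames in a row are required.
trimmed = trim_silence([0, 500, 0, 0, 500, 600, 0], threshold=10, min_sound_frames=2)
```

Requiring a minimum run keeps a single click or pop from being mistaken for the start of the audio.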
First, download the correct version of the program for whatever OS you have.
Install it correctly for your OS (follow the appropriate documentation).
Once that's done, you can use any command prompt on your system. That might be "Terminal" (found under Applications, Utilities on OS X), or cmd.exe / PowerShell, or even the "Run" box on the Windows Start menu, depending on which version of the OS you're using.
You'll probably need to modify the commands I gave to include the 'path\to\the\program.exe' as well as the "path/to/the/files", including quotes (or other means) to escape any spaces or other chars that your OS considers "special". Don't forget to use slashes in the right direction for your OS, and call it sox.exe if using Windows.
Some OS's let you drag and drop the relevant files / folders to the command prompt which can save both time and typing mistakes, making the process more straightforward, especially if you're not used to this way of working.
Good luck, and if you need more help, include specific details of what you're using, what you've done & what you get.
Nothing I know of that is available but I know it's possible. At a previous job I wrote a script that trimmed silence off of the beginning of files using sox. This would be the same thing essentially, twice then merging them. Check out sox if you want to get your hands dirty:
Edit: Disregard what's below. OP wants to output as ALAC, not FLAC.
>You're saying that SoX needs ffmpeg installed in order to be able to do the conversion in one step?
SoX should be able to accomplish the whole process by itself:
>http://sox.sourceforge.net/soxformat.html
>.flac (optional; also with −t sndfile)
>...
>SoX can write native FLAC files according to a given or default compression level.
Pulse code modulated audio (WAV, BWF, AIFF formats) is essentially volume over time: each sample is a value, and you get some number of samples per second.
To do the sort of analysis of the values, I have used the Sox (Sound Exchange) program to convert the WAV file into ASCII text decimal numbers. My recollection was that I got a comma-delimited file out of SoX.
I then just imported into Excel and use the stats tools there. In the '90's this meant that I would double check with a small subset of the file that my batch analysis would work, and then set it going on the complete file as I left work for the night.
I discovered from this process that my department had someone with key access to all offices who was so upset by computers left running overnight that this person would go into such offices and unplug the offending computer. Covering the window in the office door fixed this problem.
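These days the WAV-to-text step can be done without leaving a machine running overnight. Below is a minimal Python sketch that reads a WAV file and writes time/sample rows as CSV, ready for a spreadsheet. It is not what SoX does, just an equivalent under stated assumptions: 16-bit mono input only, and the function names and column headers are my own.

```python
import csv
import statistics
import struct
import wave

def wav_to_rows(wav_file):
    """Return (time_seconds, sample) rows from a 16-bit mono WAV path or file object."""
    with wave.open(wav_file, "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            raise ValueError("this sketch handles 16-bit mono only")
        rate = w.getframerate()
        frames = w.readframes(w.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers.
    samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
    return [(i / rate, s) for i, s in enumerate(samples)]

def write_csv(rows, out):
    """Write the rows to an open text file as comma-delimited lines."""
    writer = csv.writer(out)
    writer.writerow(["time_s", "sample"])
    writer.writerows(rows)

def summarize(rows):
    """Basic statistics over the sample values."""
    values = [s for _t, s in rows]
    return {"min": min(values), "max": max(values),
            "mean": statistics.mean(values), "stdev": statistics.pstdev(values)}
```

Trying it on a short clip first, as described above, is still a good habit before running it on a long recording.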
That could be done (assuming reading and writing WAV format is a solved problem in those languages), but one advantage of using a tool like sox is it can also do dithering after the samples are scaled down. From the sox documentation:
> For example, adjusting volume with vol 0.25 requires two additional bits in which to losslessly store its results (since 0.25 decimal equals 0.01 binary). So if the input file bit-depth is 16, then SoX’s internal representation will utilise 18 bits after processing this volume change. In order to store the output at the same depth as the input, dithering is used to remove the additional bits.
> Use the −V option to see what processing SoX has automatically added. The −D option may be given to override automatic dithering. To invoke dithering manually (e.g. to select a noise-shaping curve), see the dither effect.
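To make the quoted point concrete, here is a small Python sketch of scaling samples by 0.25 and then squeezing the result back into 16 bits with dither. I've used triangular (TPDF) dither because it is a common textbook choice; I'm not claiming this matches SoX's internals, and the function name and parameters are made up for the example.

```python
import random

def scale_with_dither(samples, gain, rng=None, bits=16):
    """Scale integer samples by gain and requantize to integers with TPDF dither."""
    rng = rng or random.Random()
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    out = []
    for s in samples:
        exact = s * gain                      # e.g. 3 * 0.25 = 0.75 needs extra bits
        dither = rng.random() - rng.random()  # triangular noise in (-1, 1) LSB
        q = int(round(exact + dither))        # back to whole output steps
        out.append(max(lo, min(hi, q)))       # clamp to the target bit depth
    return out

# Example with a seeded generator so the run is repeatable.
quiet = scale_with_dither([1, 2, 3, 1000, -1000, 32767], 0.25, random.Random(0))
```

Without the dither term, the rounding error would be correlated with the signal; the added noise trades that distortion for a low, steady hiss.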
SoX is probably the tool you're looking for. It'll take a bit of digging through the documentation, but it should do what you need. I think the remix effect may do what you want, but I haven't tested it to be sure.
What are some good tools to automate the sample-editing process? Anything off-the-shelf, or are you using custom tools?
I tried both SampleRobot and Extreme Sample Converter. They both have their strong points but I need to make sure my multi-mic samples (close/tree/far) stay phase aligned-- both start points and loops / crossfades need to match.
I know some Python, so I did some experiments with it to edit wav/aiff files. The AIF module can read/write markers, which I thought would be useful for loop points-- unfortunately WaveLab ignores them and only reads "Instrument" markers for loop points.
So-- anything good/useful for this? http://sox.sourceforge.net/ ? Or the scripting in Reaper? I was disappointed that WaveLab's batch processing can't edit loop points.
Well, the thing is DTS (the company behind DTS-HD, one of the audio formats on Blu-ray) recommends that studios encode their audio without compression.
Dolby, on the other hand, recommends that studios encode their audio with DRC. One thing to note is that the DRC mixing information is stored as metadata within the audio. Some decoders may choose to ignore the extra metadata and play the audio without any of the DRC.
SoX is a good tool for dithering, mixing, and compressing audio. I prefer to do the DRC on the playback side, so I use an HTPC with MPC-BE (similar to MPC-HC), which has a built-in normalization function. ac3filter is another filter that can be used with MPC-HC/MPC-BE and supports more control of DRC.
Interesting project. The only thing I wonder is: why Ableton? Are you only looking at mixing the audio files together? If so, I think Ableton is overkill; you could do this more simply using http://sox.sourceforge.net/ and whatever scripting language you prefer.
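With SoX itself, mixing is what the -m option is for. If you'd rather stay in a scripting language, the core of it is small enough to sketch in Python; this is an illustration under my own assumptions (integer samples, all tracks at the same sample rate, a made-up function name), not a replacement for a real mixer.

```python
def mix(tracks, bits=16):
    """Sum equal-rate sample lists, padding shorter ones with silence and clipping."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    length = max((len(t) for t in tracks), default=0)
    out = []
    for i in range(length):
        # Tracks that have already ended contribute nothing at this position.
        total = sum(t[i] for t in tracks if i < len(t))
        out.append(max(lo, min(hi, total)))  # hard clip instead of wrapping around
    return out

# Example: the shorter track simply ends early.
combined = mix([[100, 200, 300], [1, 2]])
```

If the sum clips a lot, scale each track down before mixing rather than relying on the clamp.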
Same applies for SoX, if you can use the command line, SoX can be of help.
http://sox.sourceforge.net/Main/HomePage
Some examples on usage:
http://www.thegeekstuff.com/2009/05/sound-exchange-sox-15-examples-to-manipulate-audio-files/
SoX has always been what I've used to convert audio ... but I'm kind of an old hat here. Not sure if it'll specifically do aa files, but it does most things under the sun...
No, they're Linux commands. But, as with a lot of projects on Linux, it's not unlikely that a port for Windows exists.
Oh, yes. This is the SoX project homepage, and I clearly see a mention of a Windows installer.