
All categories (272)

Gradio: Real Time Speech Recognition
Introduction: Automatic speech recognition (ASR), the conversion of spoken speech to text, is a very important and thriving area of machine learning. ASR algorithms run on practically every smartphone, and are becoming increasingly embedded in professional workflows, such as digital assistants for nurses and doctors. Because ASR algorithms are designed to be used direct.. 2024. 12. 9.
How to Use Gradio
Building Your First Demo
You can run Gradio in your favorite code editor, Jupyter notebook, Google Colab, or anywhere else you write Python. Let's write your first Gradio app:
    pip install gradio

    import gradio as gr

    def greet(name, intensity):
        return "Hello, " + name + "!" * int(intensity)

    demo = gr.Interface(
        fn=greet,
        inputs=["text", "slider"],
        outputs=["text"],
    )

    demo.launch()
https:/.. 2024. 12. 9.
Visualize STFT
fft_size is the smallest power of two that is greater than or equal to frame_size (= 25), so fft_size is 32. frame_shift = 10.
    # -*- coding: utf-8 -*-
    import wave
    import numpy as np
    import matplotlib.pyplot as plt

    if __name__ == "__main__":
        wav_file = './data/wav/xxx.wav'
        frame_size = 25
        frame_shift = 10
        out_plot = './spectrogram.png'
        with wave.open(wav_file) as wav:
            sample_frequency = wav.getframerate()
            num_samples = ..
2024. 11. 27.
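The fft_size rule in this entry (the smallest power of two that is at least frame_size) can be sketched in a few lines. This is only an illustration: `next_power_of_two` is a hypothetical helper name, not something taken from the original post, whose code is cut off before this step.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    size = 1
    while size < n:
        size *= 2
    return size


# Values quoted in the post excerpt.
frame_size = 25
frame_shift = 10

fft_size = next_power_of_two(frame_size)
print(fft_size)  # 32
```

A power-of-two length is the usual choice because FFT implementations are typically fastest at those sizes; the frame is zero-padded from frame_size up to fft_size before the transform.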
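Visualize FFT
Complex spectrum: spectrum = np.fft.fft(frame). The frame starts at 0.58 s and is 1024 samples long (a power of two).
Amplitude spectrum: the absolute value of the complex spectrum, np.abs(spectrum). The spectrum is symmetric, so only the left half is used: the absolute value is taken for bins 0 to 512.
Log amplitude spectrum: the amplitude varies over a wide range, so its logarithm is taken with np.log.
Flooring: if any frequency has an amplitude of 0, the logarithm can become minus infinity, so a very small number, 1E-7 (10 to the power of minus 7), is added first.
    # -*- coding: utf-8 -*-
    import wave
    import numpy as np
    import matplotlib.pyplot as plt

    if __name__ == "__main__":
        wav_file = './data/wav/fft..
2024. 11. 25.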
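The steps listed in this entry (complex spectrum, amplitude of bins 0 to 512, flooring, log) can be tried without a WAV file. The sketch below is not the post's own code, which is truncated: it assumes a 16 kHz sample rate and substitutes a synthetic 1000 Hz sine for the frame read from the recording, and the variable names are illustrative.

```python
import numpy as np

# Assumptions for illustration only: the post reads its frame from a WAV file.
sample_frequency = 16000   # assumed sample rate in Hz
fft_size = 1024            # frame length from the post (a power of two)
tone_hz = 1000.0           # synthetic test tone standing in for real audio

# Build one frame of fft_size samples.
t = np.arange(fft_size) / sample_frequency
frame = np.sin(2.0 * np.pi * tone_hz * t)

# Complex spectrum.
spectrum = np.fft.fft(frame)

# Amplitude spectrum: the spectrum of a real signal is symmetric,
# so keep only bins 0..fft_size/2 (0 to 512 inclusive).
absolute = np.abs(spectrum[: fft_size // 2 + 1])

# Flooring plus log: adding 1e-7 keeps log() finite where the amplitude is 0.
log_absolute = np.log(absolute + 1e-7)

# Convert the strongest bin back to a frequency in Hz.
peak_bin = int(np.argmax(log_absolute))
peak_hz = peak_bin * sample_frequency / fft_size
print(peak_bin, peak_hz)
```

With these assumed values the bin spacing is 16000 / 1024 = 15.625 Hz, so the test tone lands on a single bin and the peak is easy to check by eye when the result is plotted.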