MoviePy is a Python library for video editing tasks such as trimming, concatenating, and applying effects.
$ pip install moviepy
Load a video with VideoFileClip, then use CompositeVideoClip to overlay other clips such as a TextClip. In the following example, the text is positioned 40% from the left and 70% from the top of the frame, and is shown for 3 seconds, starting 5 seconds into the video.
from moviepy import VideoFileClip, TextClip, CompositeVideoClip, AudioFileClip, CompositeAudioClip
video = VideoFileClip("video.mp4")
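txt_clip = (
    TextClip(font="Arial Unicode.ttf", text="テスト", font_size=70, color='white')  # "テスト" means "test"
    .with_start(5)  # appear 5 seconds into the video
    .with_duration(3)  # stay on screen for 3 seconds
    .with_position((0.4, 0.7), relative=True)  # 40% from the left, 70% from the top
)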
final_video = CompositeVideoClip([video, txt_clip])
final_video.write_videofile("result.mp4", codec='libx264', audio_codec='aac')
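
To make the relative coordinates concrete, here is a minimal pure-Python sketch of the arithmetic, assuming that relative=True scales each coordinate by the frame width and height. The helper name relative_to_pixels and the 1920x1080 frame size are hypothetical and are not part of MoviePy.

```python
def relative_to_pixels(rel_pos, frame_size):
    """Convert a relative (x, y) position into pixel coordinates.

    Hypothetical helper for illustration only; not a MoviePy function.
    """
    rel_x, rel_y = rel_pos
    width, height = frame_size
    # Scale each fraction by the matching frame dimension.
    return (round(rel_x * width), round(rel_y * height))


# Assumed 1920x1080 frame; the real size comes from the loaded video.
print(relative_to_pixels((0.4, 0.7), (1920, 1080)))  # (768, 756)
```

With a real clip, video.size should give the (width, height) pair to pass as frame_size.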

Similarly, you can mix audio tracks with CompositeAudioClip. In the following example, the original video’s audio is scaled to 20% of its volume, and seconds 1 to 3 of the audio file are played starting 5 seconds into the video.
audio_clip = AudioFileClip("audio.mp3")
final_audio = CompositeAudioClip([
    video.audio.with_volume_scaled(0.2),  # original audio at 20% volume
    audio_clip.with_start(5).subclipped(1, 3),  # seconds 1-3 of audio.mp3, starting 5 s into the video
])
final_video = video.with_audio(final_audio)
final_video.write_videofile("result.mp4", codec='libx264', audio_codec='aac')
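
The timing in this example can be checked with a little arithmetic: trimming to seconds 1-3 leaves a 2-second clip, and starting it at 5 seconds means it plays from 5 to 7 seconds on the video timeline. The sketch below only reproduces that calculation; overlay_window is a hypothetical helper, not a MoviePy function.

```python
def overlay_window(start_in_video, sub_start, sub_end):
    """Return the (begin, end) seconds on the video timeline during which
    a trimmed overlay clip plays.

    Hypothetical helper for illustration only; not a MoviePy function.
    """
    if sub_end <= sub_start:
        raise ValueError("sub_end must be greater than sub_start")
    duration = sub_end - sub_start  # length of the trimmed segment
    return (start_in_video, start_in_video + duration)


# Seconds 1-3 of the audio file, placed 5 seconds into the video.
print(overlay_window(5, 1, 3))  # (5, 7)
```

The same calculation applies to the text overlay above: a 3-second clip that starts at 5 seconds is visible until 8 seconds.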