Documentation
Voice Chatroom
Overview

Last updated: 2022-03-22 13:06

1 Introduction

A voice chat room is a virtual room where users talk online by voice over microphones. Each room has 5-10 microphone positions (mic seats). The host chats on the mic and the whole room is broadcast live, so other viewers can enter the room and tune in. The host can also invite audience members to join the conversation on the mic. The number of mic positions and the maximum number of viewers differ by room type. As the live audio and video industry has grown, voice has proved to be a natural social tool: it carries more information than text and pictures, and its barrier to use is lower than video's. Many products have tried voice chat in the social field, such as Maimai for workplace networking, Fish Ear for voice social networking, singing apps for entertainment social networking, and video blind-date apps. Each focuses on a specific scenario and has become an in-depth social tool that attracts a specific group.
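As a rough illustration of the room structure described above, the following sketch models a room with a fixed number of mic positions and a capped audience. The `VoiceRoom` class, its method names, and the default audience limit are assumptions made for this example; they are not part of any ZEGO SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class VoiceRoom:
    """Hypothetical in-memory model of a voice chat room (not a ZEGO SDK type)."""
    host: str
    seat_count: int = 8          # the text describes 5-10 mic positions per room
    max_audience: int = 1000     # assumed limit; real limits vary by room type
    seats: Dict[int, str] = field(default_factory=dict)   # seat index -> user ID
    audience: Set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        if not 5 <= self.seat_count <= 10:
            raise ValueError("seat_count must be between 5 and 10")
        self.seats[0] = self.host   # the host occupies the first mic position

    def enter(self, user: str) -> bool:
        """Admit a listener unless the room is full."""
        if len(self.audience) >= self.max_audience:
            return False
        self.audience.add(user)
        return True

    def invite_to_mic(self, user: str) -> Optional[int]:
        """Move an audience member to the lowest free seat; None if that is not possible."""
        if user not in self.audience:
            return None
        for index in range(self.seat_count):
            if index not in self.seats:
                self.audience.discard(user)
                self.seats[index] = user
                return index
        return None
```

For example, after `room = VoiceRoom(host="host", seat_count=5)` and `room.enter("amy")`, calling `room.invite_to_mic("amy")` seats the user at index 1, next to the host at index 0.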

2 Application Scenarios

In a voice chat room, the room owner and several users on the mic interact online by voice. There may also be audience members who can only listen, not speak; they interact through gifts and chat messages. Rooms are usually given different themes to attract users with the same interests. Common themes include dating and making friends, FM radio, karaoke chat, interactive games, live events, and private cinema.

Voice dating room: A matchmaker acts as the host, and N users on the mic are the guests. The matchmaker keeps the session on track, livens up the atmosphere, introduces topics, and moves the activities along. In the process, the guests get to know each other better, show their personal charm, and express their interest in other guests.
Online FM (emotional companionship rooms, voice radio rooms, etc.): Either a single host broadcasts live, or a host chats with a few regular guests, while background music and sound effects play. Audience members off the mic can send gifts to get on the mic and join the voice interaction.
KTV chat room: There is usually an administrator, and the other users can request songs, comment, play song-guessing games, sing, and so on. There are two main modes: multi-person co-hosting and multi-person mic rotation. In multi-person co-hosting, one person is the lead singer, and the other users on the mic can talk while listening. The lead singer cannot hear the other mic users, but the audience in the room hears every voice. In multi-person mic rotation, one person sings each requested song, and when it ends the mic passes automatically to the next singer. While waiting, the other users can only listen and post comments; they cannot talk on the mic.
Interactive game room: Werewolf, script kill (murder mystery), pia play, Truth or Dare, draw-and-guess, etc. In this scenario, rooms are created according to the game flow, and the game's business logic controls which players speak and in what order.
Event live broadcast room: The room carries the audio and video of both the host and the event. Audience members discuss the game with the host on the mic according to the established business logic. For example, Migu's live broadcasts use Zego's capabilities to build a cloud director that combines the CCTV5 live video stream with commentary from the platform's hosts, creating the experience of watching the game together with other viewers.
Private cinema room: A room owner and several users on the mic watch movies and dramas together in the same room and trade comments while watching.
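The two KTV modes above can be sketched as follows. This is a minimal illustration, not an actual product implementation: `RotationMic` and `audible_sources` are hypothetical names, and the rule that non-lead mic users hear everyone except themselves is an assumption, since the text only states what the lead singer and the audience hear.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class RotationMic:
    """Hypothetical song queue for the multi-person mic rotation mode."""

    def __init__(self) -> None:
        self._queue: Deque[Tuple[str, str]] = deque()   # (singer, song) in request order
        self.current: Optional[Tuple[str, str]] = None

    def request_song(self, singer: str, song: str) -> None:
        self._queue.append((singer, song))
        if self.current is None:        # nobody is singing, so start right away
            self.next_turn()

    def next_turn(self) -> Optional[Tuple[str, str]]:
        """Call when a song ends: hand the mic to the next requester, if any."""
        self.current = self._queue.popleft() if self._queue else None
        return self.current

    def can_speak(self, user: str) -> bool:
        """Only the current singer may use the mic; everyone else listens and comments."""
        return self.current is not None and self.current[0] == user


def audible_sources(listener: str, lead_singer: str, mic_users: List[str]) -> List[str]:
    """Co-hosting mode: the mic users a given listener hears."""
    if listener == lead_singer:
        return []    # the lead singer does not hear the other mic users
    # Assumption: other mic users hear everyone but themselves; the audience hears all.
    return [user for user in mic_users if user != listener]
```

Keeping the rotation as a plain first-in, first-out queue means the handover after each song needs no manual step from the administrator.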

3 Solution Implementation

The voice chat room scenario mainly involves two roles: users on the mic and the audience off the mic. Each role is described below.

Users on the mic

Mic user A creates a room and becomes its administrator.
Mic user A invites other users to enter the room.
Mic user A starts pushing a stream.
Mic user B starts pushing a stream and pulls mic user A's stream to interact with A.
The stream-mixing service is started, and the audio is recorded through the CDN.
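The mic-user steps above can be walked through with a small in-memory model. `RoomSession` and all of its methods are hypothetical stand-ins that only track state; they do not call any real SDK, and CDN recording is not modeled. The step where A pulls B's stream is an assumption implied by two-way interaction rather than something the list states.

```python
from typing import Dict, List, Set


class RoomSession:
    """Hypothetical stand-in for a room service; it only tracks state in memory."""

    def __init__(self, room_id: str, creator: str) -> None:
        self.room_id = room_id
        self.admin = creator                    # the creator becomes the administrator
        self.members: Set[str] = {creator}
        self.published: Dict[str, str] = {}     # user ID -> stream ID
        self.playing: Dict[str, Set[str]] = {}  # user ID -> stream IDs being pulled

    def invite(self, inviter: str, user: str) -> None:
        if inviter not in self.members:
            raise PermissionError("only room members can invite")
        self.members.add(user)

    def push(self, user: str) -> str:
        if user not in self.members:
            raise PermissionError("log in to the room before pushing")
        stream_id = f"{self.room_id}_{user}"
        self.published[user] = stream_id
        return stream_id

    def pull(self, user: str, stream_id: str) -> None:
        if stream_id not in self.published.values():
            raise LookupError(f"no such stream: {stream_id}")
        self.playing.setdefault(user, set()).add(stream_id)

    def mix_inputs(self) -> List[str]:
        """Streams that a mixing task would take as input, in a stable order."""
        return sorted(self.published.values())


room = RoomSession("room1", creator="A")   # 1. A creates the room
room.invite("A", "B")                      # 2. A invites B
stream_a = room.push("A")                  # 3. A starts pushing
stream_b = room.push("B")                  # 4. B pushes and pulls A's stream
room.pull("B", stream_a)
room.pull("A", stream_b)                   #    (assumed) A pulls B's stream in return
mix = room.mix_inputs()                    # 5. inputs handed to the mixing service
```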

Audience off the mic

Audience members C, D, E, F, and so on enter the room.
The audience pulls the streams of mic users A and B and listens to their interactive audio.
The audience interacts with users on the mic by sending gifts and room messages.
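The audience side can be sketched in the same spirit. The example below shows which streams a listener would pull, with or without a mixed stream, and how a gift message might be represented. `RoomMessage`, `streams_to_pull`, and `send_gift` are hypothetical names chosen for illustration, and the message fields are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RoomMessage:
    """Hypothetical room signaling payload sent by an audience member."""
    sender: str
    kind: str                      # "chat" or "gift"
    content: str
    target: Optional[str] = None   # mic user receiving a gift; None for chat


def streams_to_pull(mic_streams: List[str], mixed_stream: Optional[str] = None) -> List[str]:
    """Pull the single mixed stream if one exists, otherwise every mic stream."""
    return [mixed_stream] if mixed_stream else list(mic_streams)


def send_gift(sender: str, target: str, gift: str, mic_users: List[str]) -> RoomMessage:
    """Gifts go to users on the mic, so reject any other target."""
    if target not in mic_users:
        raise ValueError(f"{target} is not on the mic")
    return RoomMessage(sender=sender, kind="gift", content=gift, target=target)
```

With two mic users, `streams_to_pull(["a", "b"])` returns both streams, while `streams_to_pull(["a", "b"], "mix")` returns only the mixed one.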

4 Scenario Advantages

High sound quality and low latency provide a stable basic experience

Resistance to weak networks keeps the experience stable
A self-developed audio engine guarantees sound quality
Global nodes guarantee low latency
High concurrency and stability during evening peak hours

Voice chat room best practices provide more comprehensive safeguards

Authentication and a solution that prevents mic bombing
Fast room switching
One-stop solution for content moderation
Mic seat synchronization and a high-availability guarantee for the solution

Rich add-on features for voice chat rooms bring more value-added functions

Effects such as voice changing, pitch shifting, and reverb, which make rooms more varied and fun
Stereo (two-channel) effects and audio pre-processing/custom audio capture
Sound effect and BGM playback
Audio spectrum and sound waves

5 Function List

Main function	Description
Room login	After logging in to the room, users on the mic and the audience off the mic can push and pull streams and use other functions.
Stream pushing	Pushes your own audio stream. Mainly used by users on the mic to send out their own audio data.
Stream pulling	Plays audio streams, mainly the audio data of users on the mic.
Room signaling	Sends and receives messages. Mainly used by the audience off the mic to join the interaction by sending text messages.
Stream mixing	Users on the mic can start mixing, which combines multiple audio streams into a single stream. Once the mixed stream is pushed, users off the mic only need to pull one stream to hear the interactive audio of the users on the mic, which reduces development complexity and device performance requirements.
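To make the idea of combining several audio streams into one more concrete, here is a minimal sketch that sums aligned 16-bit PCM samples and clips the result. It illustrates the concept only and is not a description of how the mixing service works internally; `mix_pcm` is a hypothetical helper, and real mixers also handle resampling, gain control, and timing.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm(streams: Sequence[Sequence[int]]) -> List[int]:
    """Mix 16-bit PCM sample sequences by summing aligned samples and clipping."""
    if not streams:
        return []
    length = max(len(s) for s in streams)
    mixed: List[int] = []
    for i in range(length):
        # Streams that have already ended contribute silence.
        total = sum(s[i] for s in streams if i < len(s))
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed
```

A listener who receives the output of `mix_pcm` decodes one stream instead of one per mic user, which is the saving the table row describes.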

Note: To implement more advanced audio functions with the ZegoExpress SDK, such as custom audio capture, mixing, audio spectrum and sound waves, and voice changing, see Audio Advanced.