Set WebRTC video codec H.264 profile-level-id to 42e01f to be compatible with Firefox clients #109

Merged
merged 1 commit into from Dec 5, 2021

mooons
Contributor

@mooons mooons commented Dec 5, 2021

Problem

With NEKO_VP8: 'false', NEKO_VP9: 'false' and NEKO_H264: 'true' under docker-compose.yaml's environment tag, the video stream would be encoded in H.264 for WebRTC.

But in server/internal/webrtc/webrtc.go, with the current profile-level-id=42001f, Firefox 94.0.2 (on the client-side) won't accept the WebRTC video stream:

https://github.com/m1k1o/neko/blame/c97b1fc4541caabf6b00331d081b02d2f9c58751/server/internal/webrtc/webrtc.go#L318

The symptom is that the Firefox client displays a "Disconnected - connection timeout" message in the top-left corner about 15s after logging in, even though the correct password was entered. (Chrome works just fine, btw.)

Meanwhile, the following shows up in the container log:

neko_1  | 2021-12-05 10:04:09,952 DEBG 'neko' stdout output:
neko_1  | 10:04AM ERR message handler has failed error="signal/answer failed: unable to start track, codec is not supported by remote" module=websocket
neko_1  | 
neko_1  | 2021-12-05 10:04:24,715 DEBG 'neko' stdout output:
neko_1  | 10:04AM WRN read message error error="websocket: close 1005 (no status)" module=websocket

Also, Firefox's about:webrtc shows that only the audio track is accepted (recvonly) while the video track is not (inactive). This further indicates that the video codec offered by the server is not accepted:

Local SDP (Answer)

...
m=video 0 UDP/TLS/RTP/SAVPF 120
c=IN IP4 0.0.0.0
a=inactive
a=mid:0
a=rtpmap:120 VP8/90000
m=audio 9 UDP/TLS/RTP/SAVPF 111
c=IN IP4 0.0.0.0
a=recvonly
a=fmtp:111 maxplaybackrate=48000;stereo=1;useinbandfec=1

For reference, in Remote SDP (Offer), we can see that H.264 profile-level-id=42001f is offered by the neko server in this case (but not accepted by the Firefox client):

Remote SDP (Offer)

...
m=video 9 UDP/TLS/RTP/SAVPF 102
c=IN IP4 0.0.0.0
a=candidate:821511142 1 udp 2130706431 173.230.155.236 52095 typ host
a=sendrecv
a=extmap:1 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=fmtp:102 profile-level-id=42001f;level-asymmetry-allowed=1;packetization-mode=1
a=ice-pwd:qkAMOCulBlUBpsYWAHyjHvmOrjNkEKEF
a=ice-ufrag:jOGnbfDDElFSztKI
a=mid:0
a=msid:stream video
a=rtcp-fb:102 nack
a=rtcp-fb:102 nack pli
a=rtcp-fb:102 transport-cc
a=rtcp-mux
a=rtcp-rsize
a=rtpmap:102 H264/90000
a=setup:actpass
a=ssrc:1539325104 cname:stream
a=ssrc:1539325104 msid:stream video
a=ssrc:1539325104 mslabel:stream
a=ssrc:1539325104 label:video
m=audio 9 UDP/TLS/RTP/SAVPF 111
c=IN IP4 0.0.0.0
a=sendrecv
a=extmap:1 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=ice-pwd:qkAMOCulBlUBpsYWAHyjHvmOrjNkEKEF
a=ice-ufrag:jOGnbfDDElFSztKI
a=mid:1
a=msid:stream audio
a=rtcp-fb:111 transport-cc
a=rtcp-mux
a=rtcp-rsize
a=rtpmap:111 opus/48000/2
a=setup:actpass
a=ssrc:814939082 cname:stream
a=ssrc:814939082 msid:stream audio
a=ssrc:814939082 mslabel:stream
a=ssrc:814939082 label:audio
m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=sendrecv
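
When comparing offers and answers like the ones above, it can help to pull the profile-level-id out of the SDP programmatically instead of reading it by eye. The following is a minimal sketch using only the Go standard library; `profileLevelIDs` is a hypothetical helper name and is not part of neko or of any WebRTC library.

```go
package main

import (
	"fmt"
	"strings"
)

// profileLevelIDs returns the profile-level-id of every a=fmtp line in an SDP
// blob, keyed by payload type (for example "102" -> "42001f").
// Hypothetical helper for illustration only.
func profileLevelIDs(sdp string) map[string]string {
	ids := make(map[string]string)
	for _, line := range strings.Split(sdp, "\n") {
		line = strings.TrimSpace(line) // also drops a trailing "\r"
		if !strings.HasPrefix(line, "a=fmtp:") {
			continue
		}
		// Format: "a=fmtp:<payload type> <key>=<value>;<key>=<value>;..."
		fields := strings.SplitN(strings.TrimPrefix(line, "a=fmtp:"), " ", 2)
		if len(fields) != 2 {
			continue
		}
		for _, param := range strings.Split(fields[1], ";") {
			kv := strings.SplitN(strings.TrimSpace(param), "=", 2)
			if len(kv) == 2 && kv[0] == "profile-level-id" {
				ids[fields[0]] = strings.ToLower(kv[1])
			}
		}
	}
	return ids
}

func main() {
	// Two lines taken from the Remote SDP (Offer) above.
	offer := "a=fmtp:102 profile-level-id=42001f;level-asymmetry-allowed=1;packetization-mode=1\n" +
		"a=rtpmap:102 H264/90000\n"
	fmt.Println(profileLevelIDs(offer)["102"]) // prints "42001f"
}
```

Running the same helper over the Firefox answer makes the mismatch obvious: with 42001f offered, the answer contains no H.264 fmtp line at all.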

The Fix

I simply changed the profile-level-id from 42001f to 42e01f. Now it works flawlessly in both Firefox and Chrome.

(Tested with ./build then ./build firefox under .docker. Redeployed the neko container with docker-compose up.)

Now Firefox client's about:webrtc shows that both video and audio tracks are accepted:

Local SDP (Answer)

...
m=video 9 UDP/TLS/RTP/SAVPF 102
c=IN IP4 0.0.0.0
a=candidate:0 1 UDP 2122252543 88cfb9cf-5f54-7e4b-8c60-b79347c1e99a.local 52106 typ host
a=candidate:1 1 UDP 2122187007 c1e576e0-ece0-b948-a729-f2bbfc0e9aa1.local 55440 typ host
a=candidate:2 1 TCP 2105524479 88cfb9cf-5f54-7e4b-8c60-b79347c1e99a.local 9 typ host tcptype active
a=candidate:3 1 TCP 2105458943 c1e576e0-ece0-b948-a729-f2bbfc0e9aa1.local 9 typ host tcptype active
a=recvonly
a=end-of-candidates
a=extmap:1 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=fmtp:102 profile-level-id=42e01f;level-asymmetry-allowed=1;packetization-mode=1
a=ice-pwd:65345237c33e8c43e85504904fcc628a
a=ice-ufrag:f6ed818c
a=mid:0
a=rtcp-fb:102 nack
a=rtcp-fb:102 nack pli
a=rtcp-fb:102 transport-cc
a=rtcp-mux
a=rtcp-rsize
a=rtpmap:102 H264/90000
a=setup:active
a=ssrc:718473048 cname:{c34aa897-7975-544c-b53a-54d3c8439e6f}
m=audio 9 UDP/TLS/RTP/SAVPF 111
c=IN IP4 0.0.0.0
a=recvonly

Note that Chrome (96.0.4664.55)'s WebRTC implementation accepts an H.264 stream with either profile-level-id=42e01f or profile-level-id=42001f without a problem. Maybe Chrome works with both profiles, or it just doesn't care about the constraint flag difference in profile-level-id.

What's the catch (of changing profile-level-id in this case)?

tl;dr: It should be fine. If any compatibility issues (with the decoder) are discovered on some other devices down the road, we could just explicitly set the encoder profile to Constrained Baseline on the server side.
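
One way to pin the encoder profile could look like the sketch below. It assumes that x264enc picks its profile from the downstream caps filter and that GStreamer names the profile `constrained-baseline`; neither assumption has been verified against neko here. `pinH264Profile` is a hypothetical helper, not existing neko code.

```go
package main

import (
	"fmt"
	"strings"
)

// h264Caps is the caps filter that follows x264enc in the logged pipeline.
const h264Caps = "video/x-h264,stream-format=byte-stream"

// pinH264Profile is a hypothetical helper: it adds a profile field to the
// first video/x-h264 caps filter in a pipeline string. The pipeline is
// returned unchanged if that caps filter is not present.
func pinH264Profile(pipeline, profile string) string {
	return strings.Replace(pipeline, h264Caps, h264Caps+",profile="+profile, 1)
}

func main() {
	// Tail of the video pipeline from the container log.
	pipeline := "x264enc threads=4 bitrate=3072 key-int-max=60 vbv-buf-capacity=3072 " +
		"byte-stream=true tune=zerolatency speed-preset=veryfast ! " +
		"video/x-h264,stream-format=byte-stream ! appsink name=appsink"

	// Assumption: "constrained-baseline" is the caps value GStreamer expects.
	fmt.Println(pinH264Profile(pipeline, "constrained-baseline"))
}
```

With the profile pinned this way, the encoder output and the profile-level-id advertised in the SDP would no longer depend on encoder defaults.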

profile-level-id=42001f indicates a Baseline profile, while 42e01f indicates a Constrained Baseline profile. (See the section below for how this is deduced.)

According to Wikipedia:

[Baseline Profile] includes all features that are supported in the Constrained Baseline Profile.

42e01f working but 42001f not working for the Firefox client should indicate that the Firefox WebRTC implementation only supports Constrained Baseline Profile (for whatever reason). (But the video decoder it eventually uses may well support both Constrained Baseline Profile and Baseline Profile, or even Main, Extended and High, which I haven't looked into deeply.)

On the encoder (server) side, as far as I can tell from the code base, neko uses x264enc when the openh264 plugin is not installed. And indeed it uses x264enc in my testing. From the container log again:

neko_1  | 2021-12-05 07:16:39,719 DEBG 'neko' stdout output:
neko_1  | 7:16AM INF Pipelines starting... audio_codec=Opus audio_device=auto_null.monitor audio_pipeline_src="pulsesrc device=auto_null.monitor ! audio/x-raw,channels=2 ! audioconvert ! opusenc bitrate=196000 ! appsink name=appsink" module=remote screen_resolution=1920x1080@60 video_codec=H264 video_display=:99.0 video_pipeline_src="ximagesrc display-name=:99.0 show-pointer=true use-damage=false ! video/x-raw,framerate=25/1 ! videoconvert ! queue ! video/x-raw,format=NV12 ! x264enc threads=4 bitrate=3072 key-int-max=60 vbv-buf-capacity=3072 byte-stream=true tune=zerolatency speed-preset=veryfast ! video/x-h264,stream-format=byte-stream ! appsink name=appsink"

The profile isn't explicitly specified in the pipeline. I would guess either Baseline or Constrained Baseline. I tried to determine the actual profile the encoder uses, but no luck so far. (Skimmed the x264enc doc, tried chrome://media-internals/, and captured the RTP stream with Wireshark.)
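
Another option would be to read the profile straight from the encoder output. Since the pipeline emits an Annex B byte stream (byte-stream=true), the first three bytes after the NAL header of a sequence parameter set are profile_idc, the constraint flags, and level_idc. Below is a minimal sketch of that idea; `spsProfileLevelID` is a hypothetical helper, and the sample bytes in `main` are synthetic, not captured from neko.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// spsProfileLevelID scans an Annex B byte stream for the first sequence
// parameter set (NAL unit type 7) and returns the three bytes that follow its
// NAL header as a hex string, the same form SDP uses for profile-level-id.
func spsProfileLevelID(stream []byte) (string, error) {
	startCode := []byte{0x00, 0x00, 0x01} // also matches the 4-byte form 00 00 00 01
	for {
		i := bytes.Index(stream, startCode)
		if i < 0 {
			return "", errors.New("no SPS NAL unit found")
		}
		stream = stream[i+len(startCode):]
		// Need the NAL header plus profile_idc, constraint flags and level_idc.
		if len(stream) >= 4 && stream[0]&0x1f == 7 {
			return fmt.Sprintf("%02x%02x%02x", stream[1], stream[2], stream[3]), nil
		}
	}
}

func main() {
	// Synthetic sample: an access unit delimiter (type 9) followed by an SPS
	// header (0x67 = nal_ref_idc 3, type 7). Not real encoder output.
	sample := []byte{
		0x00, 0x00, 0x00, 0x01, 0x09, 0xf0,
		0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0xe0, 0x1f, 0x8c,
	}
	id, err := spsProfileLevelID(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id) // prints "42e01f"
}
```

Feeding this the first buffers from the appsink (or a dump of them) should show what x264enc actually signals, independent of what the SDP advertises.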

Bonus: What does profile-level-id=42e01f even mean?

According to RFC 6184: RTP Payload Format for H.264 Video

   OPTIONAL parameters:

      profile-level-id:
         A base16 [7] (hexadecimal) representation of the following
         three bytes in the sequence parameter set NAL unit is specified
         in [1]: 1) profile_idc, 2) a byte herein referred to as
         profile-iop, composed of the values of constraint_set0_flag,
         constraint_set1_flag, constraint_set2_flag,
         constraint_set3_flag, constraint_set4_flag,
         constraint_set5_flag, and reserved_zero_2bits in bit-
         significance order, starting from the most-significant bit, and
         3) level_idc.  Note that reserved_zero_2bits is required to be
         equal to 0 in [1], but other values for it may be specified in
         the future by ITU-T or ISO/IEC.

So profile-level-id=42e01f would translate into:

profile_idc=0x42 -> decimal 66
level_idc=0x1f   -> decimal 31
profile-iop=0xe0 -> binary 11100000
constraint_set0_flag=1 -> indicates that the coded video sequence obeys all constraints specified in clause A.2.1 [2].

constraint_set1_flag=1 -> indicates that the coded video sequence obeys all constraints specified in clause A.2.2 [2]. This flag=1 and profile_idc=0x42 matches CB (Constrained Baseline profile) in Table 5.

constraint_set2_flag=1 -> indicates that the coded video sequence obeys all constraints specified in clause A.2.3 [2].

constraint_set3_flag=0 -> ignored by decoder when profile_idc=66 and level_idc!=11 [2]

Table 5 [2]:

              Profile     profile_idc        profile-iop
                          (hexadecimal)      (binary)

              CB          42 (B)             x1xx0000
                 same as: 4D (M)             1xxx0000
                 same as: 58 (E)             11xx0000
              B           42 (B)             x0xx0000
                 same as: 58 (E)             10xx0000
              M           4D (M)             0x0x0000
...
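
The decoding steps above can be sketched in Go as follows. The type and function names are hypothetical, and only the Table 5 rows quoted above (CB, B, M) are covered; any other combination is reported as unclassified.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// profileLevelID holds the three bytes encoded in the SDP parameter.
type profileLevelID struct {
	ProfileIDC uint8
	ProfileIOP uint8
	LevelIDC   uint8
}

// parseProfileLevelID decodes a 6-digit hex string such as "42e01f".
func parseProfileLevelID(s string) (profileLevelID, error) {
	raw, err := hex.DecodeString(s)
	if err != nil {
		return profileLevelID{}, err
	}
	if len(raw) != 3 {
		return profileLevelID{}, fmt.Errorf("want 3 bytes, got %d", len(raw))
	}
	return profileLevelID{ProfileIDC: raw[0], ProfileIOP: raw[1], LevelIDC: raw[2]}, nil
}

// table5 mirrors the quoted rows of Table 5 in RFC 6184. Each bit pattern
// (x = don't care) is expressed as a mask and the value expected under it.
var table5 = []struct {
	idc, mask, value uint8
	name             string
}{
	{0x42, 0x4f, 0x40, "Constrained Baseline"}, // x1xx0000
	{0x4d, 0x8f, 0x80, "Constrained Baseline"}, // 1xxx0000
	{0x58, 0xcf, 0xc0, "Constrained Baseline"}, // 11xx0000
	{0x42, 0x4f, 0x00, "Baseline"},             // x0xx0000
	{0x58, 0xcf, 0x80, "Baseline"},             // 10xx0000
	{0x4d, 0xaf, 0x00, "Main"},                 // 0x0x0000
}

// profileName returns the matching Table 5 profile, or "unclassified".
func (p profileLevelID) profileName() string {
	for _, row := range table5 {
		if p.ProfileIDC == row.idc && p.ProfileIOP&row.mask == row.value {
			return row.name
		}
	}
	return "unclassified"
}

func main() {
	for _, id := range []string{"42001f", "42e01f"} {
		p, err := parseProfileLevelID(id)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: profile_idc=%d profile-iop=%08b level_idc=%d -> %s\n",
			id, p.ProfileIDC, p.ProfileIOP, p.LevelIDC, p.profileName())
	}
}
```

For the two values discussed in this PR, the sketch classifies 42001f as Baseline and 42e01f as Constrained Baseline, matching the manual reading of the table.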

References

  1. https://stackoverflow.com/q/22960928

  2. Section 7.4.2.1.1 "Sequence parameter set data semantics" in https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-H.264-201602-S!!PDF-E&type=items

Please kindly review. Thanks :)

@m1k1o
Owner

m1k1o commented Dec 5, 2021

Thank you for helping out with this and for your detailed analysis of this topic. I was not aware of where the H264 incompatibility in Firefox comes from.

I agree that profile and level should be explicitly set in the encoding pipeline so that it matches payload format in WebRTC.

@m1k1o m1k1o merged commit 3a61d3a into m1k1o:master Dec 5, 2021
@m1k1o m1k1o mentioned this pull request Dec 5, 2021
m1k1o added a commit that referenced this pull request Mar 31, 2023