Opened 6 years ago

Closed 5 years ago

Last modified 5 years ago

#1299 closed bug (wontfix)

Loss of LAN ports on FONERA 2n with ONKYO TX-NR414 connected

Reported by: cmford@…
Owned by:
Priority: normal
Milestone:
Component: fon-base-firmware
Version: 2.3.7.0 (Paco)
Severity: unknown
Cc:
Hardware: 2.0n (FON2300)

Description (last modified by matthijs)

With my FONERA 2n (f/w 2.3.7.0) in "bridge" or "wifi bridge" mode, the following situation occurs. The same is true in DHCP mode.

  1. Power up Fonera, then power up NR414 - LAN LED on, connection made.
  2. Power off NR414 and power it up later - LAN LED off, no connection.
  3. While the NR414 is powered up, transfer the Ethernet cable to another LAN port: the LED comes on and the connection is made.
  4. This process (i.e. points 2. and 3.) can be repeated until all 4 LAN ports are "dead".
  5. Power cycle the Fonera and all LAN ports become available.

Attachments (13)

SSH result 8 May 2013.JPG (60.4 KB) - added by cmford@… 6 years ago.
Showing error message when running SSH commands
1st command ONKYO on FON power cycled.txt (9.6 KB) - added by cmford@… 6 years ago.
2nd command ONKYO on FON power cycled.txt (12.3 KB) - added by cmford@… 6 years ago.
1st command ONKYO off.txt (9.5 KB) - added by cmford@… 6 years ago.
2nd command ONKYO off.txt (12.3 KB) - added by cmford@… 6 years ago.
1st command ONKYO on.txt (9.5 KB) - added by cmford@… 6 years ago.
2nd command ONKYO on.txt (12.3 KB) - added by cmford@… 6 years ago.
reg_dump.sh (425 bytes) - added by matthijs 6 years ago.
Script to dump switch registers
reg_dump 1 after power cycle.txt (2.9 KB) - added by Chris Ford <cmford@…> 6 years ago.
Registers after Fonera 2n power cycle
reg_dump 2 ONKYO 1st time on.txt (2.9 KB) - added by Chris Ford <cmford@…> 6 years ago.
Registers after ONKYO first connected and powered up
reg_dump 3 ONKYO power off.txt (2.9 KB) - added by Chris Ford <cmford@…> 6 years ago.
Registers after ONKYO 1st powered off
reg_dump 4 ONKYO 2nd power on.txt (2.9 KB) - added by Chris Ford <cmford@…> 6 years ago.
Registers after ONKYO powered back on
reg_dump 5 with ACER connected.txt (2.9 KB) - added by Chris Ford <cmford@…> 6 years ago.
Registers with Acer Laptop connected and on

Change History (38)

comment:1 Changed 6 years ago by matthijs

  • Component changed from unknown to fon-base-firmware
  • Description modified (diff)
  • Status changed from new to infoneeded
  • Summary changed from Loss of LAN ports on FONERA 2n with ONKYA TX-NR414 connected to Loss of LAN ports on FONERA 2n with ONKYO TX-NR414 connected
  • Version changed from Unknown to 2.3.7.0 (Paco)

Thanks for reporting this (quite peculiar) bug!

I fixed the list markup in your report, don't forget to use the preview button :-)

As already asked through e-mail, could you get a register dump:

  1. after a reboot of the Fonera, before powering down the ONKYO
  2. after powering down the ONKYO
  3. after powering up the ONKYO again

Please also include information about which ports on the Fonera were connected.

To make a register dump, run the following commands:

root@Fonera:~# for reg in $(seq 0 252); do switch reg r $(printf '%x' $reg); done > /tmp/reg_dump.txt
root@Fonera:~# (echo "PHY REGS"; for port in $(seq 0 1 4); do echo "PORT: ${port }"; for reg in $(seq 0 1 6); do echo "REG: ${reg}"; switch reg w c0 "00004${reg} 0${port}"; switch reg r c4; done; echo; done) >> /tmp/reg_dump.txt

Then download the /tmp/reg_dump.txt after each dump and attach them here, please.

Changed 6 years ago by cmford@…

Showing error message when running SSH commands

comment:2 Changed 6 years ago by cmford@…

I have loaded the developer firmware and tried to run the commands that you asked for. When I give the second command the response is:-

-ash: syntax error: bad substitution (see attached file)

I am probably doing something incorrectly but I am keen to assist so please advise.

Please be aware that I am not a developer, simply a reasonably IT literate user. I spent 40 years of my working life with computers and communications and so I know how complex firmware code can be and the difficulties of de-bugging it without good information. As I say, I am here to help resolve this and any other issue where I can assist.

comment:3 Changed 6 years ago by matthijs

Ah, it seems there are some extra spaces that mess up the command (probably resulting from a line wrap at some point). The first command in your screenshot has an extra space (or even newline) inside "don e"; the second command was already wrong in my comment above (extra space in "${port }"). Here are the fixed commands, which I retested on my own Fonera just now:

root@Fonera:~# for reg in $(seq 0 252); do switch reg r $(printf '%x' $reg); done > /tmp/reg_dump.txt
root@Fonera:~# (echo "PHY REGS"; for port in $(seq 0 1 4); do echo "PORT: ${port}"; for reg in $(seq 0 1 6); do echo "REG: ${reg}"; switch reg w c0 "00004${reg
} 0${port}"; switch reg r c4; done; echo; done) >> /tmp/reg_dump.txt
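
For reference, the "bad substitution" error can be reproduced away from the Fonera in any POSIX-style shell. The sketch below is illustrative only: `try_snippet` is a made-up helper name, and all it does is run a snippet in a child `sh` and report whether that shell accepted it.

```shell
#!/bin/sh
# try_snippet is a hypothetical helper: it runs a snippet in a child shell
# so that a syntax error there does not abort this script.
try_snippet() {
    if sh -c "$1" >/dev/null 2>&1; then
        echo "accepted"
    else
        echo "rejected"
    fi
}

# Correct expansion: no whitespace inside the braces.
try_snippet 'port=3; echo "PORT: ${port}"'

# Stray space before the closing brace, as in the first version of the command.
try_snippet 'port=3; echo "PORT: ${port }"'
```

The second snippet is the one a shell refuses, which is why any whitespace that a line wrap introduces inside `${...}` has to be removed before the command is run.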

As for (not) being a developer, I'm aware of that. Any info you can provide is great. As you already pointed out, issues like this are pretty hard to debug, so I'm happy you're willing to help us out here.

Usually, I start out with debugging instructions that are pretty concise, and expand them if needed, but it seems I forgot to add the "If you have any questions or need more details, feel free to ask!" line here, apologies for that :-)

Thanks!

Changed 6 years ago by cmford@…

comment:4 Changed 6 years ago by cmford@…

I still had some problems with the second command as some extra spaces seem to have been generated but once I removed them I appear to have got it to work. I have attached a number of files with the results. I hope that they are what you are looking for.

During this testing the Fonera 2n was configured as a wifi bridge and the only wired connection was the ONKYO on LAN port 1.

Please let me know if there is anything else you would like me to do.

comment:5 Changed 6 years ago by matthijs

  • Status changed from infoneeded to investigate

Output looks great, thanks! I'll have a closer look in the coming days to see if I can find anything out of the ordinary.

FYI, the "1st command" files weren't needed, since the second command appends its output to the file, so the "2nd command" files contain the output of both commands. Better to have too much than too little, of course :-)

comment:6 Changed 6 years ago by matthijs

I had a closer look at this, but the registers don't give any hints. When the ONKYO is turned off, the only register change (except packet counters and the PHY access register) is to change the port state from 100Mbit full duplex up to 10Mbit half duplex down. When the ONKYO is turned on again, there are no register changes at all (again except packet counters).

This suggests the hardware doesn't see the new connection at all (OTOH, the good packet counter for the port increases a bit between the ONKYO off and ONKYO on dumps, which suggests that at least some data was exchanged after powering on the ONKYO).

In any case, it seems there is no software-visible error state or something like that which we can somehow detect and fix, which really suggests that the problem occurs in hardware. The fact that the port becomes permanently unusable suggests that the 2.0n switch or PHY hardware somehow gets into an undefined state because of something atypical (or perhaps even outside of the specifications, dunno) the ONKYO does.

I'll see if we can somehow ask Ralink, the manufacturer of the chipset for the 2.0n, about this issue, but I'm not very optimistic that this will result in anything useful. For you, the best way forward is probably to add an extra switch between the Fonera and the ONKYO, which I think will work around the problem...

For future reference: The register dump commands I gave are a bit overzealous: They dump a doubleword (4 bytes) at every address, even though the registers really are spaced 4 bytes apart. So only the addresses divisible by four (x0, x4, x8 and xc) are useful, the others contain a mix of bytes from two different registers.
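
A minimal sketch of reducing an existing dump to the useful, aligned offsets, assuming the dump lines have the `switch reg read offset=XX, value=YY` form seen in the diffs in this ticket. The name `aligned_only` is made up for illustration and is not part of the firmware; the values on the unaligned example lines are placeholders, not real readings.

```shell
#!/bin/sh
# aligned_only (hypothetical helper): read dump lines on stdin and print only
# those whose hex offset is divisible by four, i.e. ends in 0, 4, 8 or c.
aligned_only() {
    grep -E 'offset=[0-9a-f]*[048c],'
}

# Example input; the lines at offsets 81 and 86 use placeholder values.
printf '%s\n' \
    'switch reg read offset=80, value=91809108' \
    'switch reg read offset=81, value=0' \
    'switch reg read offset=84, value=0' \
    'switch reg read offset=86, value=0' |
    aligned_only
```

Alternatively, the dump loop itself could probably step by four with `seq 0 4 252`, which would avoid reading the unaligned addresses in the first place.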

When the ONKYO is powered off, the following registers change:

@@ -123,13 +123,13 @@                                                          
 switch reg read offset=7c, value=ffffffff                                     
-switch reg read offset=80, value=91809108                                     
+switch reg read offset=80, value=81808100                                     
 switch reg read offset=84, value=0                                            
@@ -187,13 +187,13 @@                                                          
 switch reg read offset=bc, value=2000                                         
-switch reg read offset=c0, value=80001f00                                     
+switch reg read offset=c0, value=6                                            
 switch reg read offset=c4, value=0                                            
@@ -219,13 +219,13 @@                                                          
 switch reg read offset=dc, value=6a                                           
-switch reg read offset=e0, value=27602db                                      
+switch reg read offset=e0, value=56805f7                                      
 switch reg read offset=e4, value=0                                            
@@ -239,11 +239,11 @@                                                          
 switch reg read offset=f0, value=0                                            
-switch reg read offset=f4, value=1a2                                          
+switch reg read offset=f4, value=27d                                          
 switch reg read offset=f8, value=0 
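
To see exactly which bits differ between the two values of the register at offset 80, the old and new values can be XORed. This is plain arithmetic on the values from the diff above and makes no assumption about what each bit means; `changed_bits` is a name chosen here for illustration.

```shell
#!/bin/sh
# changed_bits (hypothetical helper): print, in hex, the bits that differ
# between two 32-bit register values given as hex strings without "0x".
changed_bits() {
    printf '%08x\n' $(( 0x$1 ^ 0x$2 ))
}

# Register at offset 80 before and after powering off the ONKYO.
changed_bits 91809108 81808100
```

This prints 10001008, so three bits changed (bits 28, 12 and 3). That would fit one link flag, one duplex flag and one speed flag for a single port, although the register layout itself is not given in this ticket.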

When it is turned on again, the following registers change:

@@ -219,13 +219,13 @@                                                          
 switch reg read offset=dc, value=6a
-switch reg read offset=e0, value=56805f7
+switch reg read offset=e0, value=75c0aae
 switch reg read offset=e4, value=0                                            
@@ -239,10 +239,10 @@                                                          
 switch reg read offset=f0, value=0
-switch reg read offset=f4, value=27d
+switch reg read offset=f4, value=28a                                          
 switch reg read offset=f8, value=0

comment:7 follow-up: Changed 6 years ago by matthijs

One more question: when does the port become dead, when powering off the ONKYO or when powering it on? In other words, does the port still work when you power off the ONKYO, then unplug the ONKYO and try to connect another device to the same port?

comment:8 Changed 6 years ago by cmford@…

I am not at home again until Sunday and will check then.

Changed 6 years ago by matthijs

Script to dump switch registers

comment:9 Changed 6 years ago by matthijs

I did a closer review of the register dumps (instead of reviewing just the differences, I reviewed _all_ the registers). Again, I did not find anything out of the ordinary and even nothing to distinguish the dead ports from normal unconnected ports.

However, I did notice that the last set of registers (the PHY registers) were not dumped correctly: there was still some extra space in the above commands :-S I'll need another set of register dumps, but to prevent further problems, I have created a script containing the needed commands, which should prevent copy-paste errors. So, to make a register dump now, run:

root@Fonera:~# wget http://trac.fonosfera.org/fon-ng/raw-attachment/ticket/1299/reg_dump.sh
Connecting to trac.fonosfera.org (67.192.249.56:80)
reg_dump.sh          100% |*******************************|   425  --:--:-- ETA
root@Fonera:~# sh reg_dump.sh

The parts after the # are the commands, the lines in between are the expected output. The first command downloads the script I created from this ticket (so your Fonera needs to be connected to the internet), the second runs it. Again, this creates a /tmp/reg_dump.txt with the output I need.

Could you run this script:

  1. after a reboot of the Fonera, before connecting the ONKYO
  2. after connecting the (powered on) ONKYO
  3. after powering down the ONKYO
  4. after powering up the ONKYO again
  5. after disconnecting the ONKYO and connecting another device to the same port

Additionally, please also perform the test I asked for a few comments back, but please do not mix that test with the register dumping.

Feel free to merge the generated reg_dump.txt files (they're just plaintext files), but if you do, include clear comments above each register dump to tell when the dump was made.
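
One possible way to merge the dumps with a label above each is sketched below. The helper name `append_dump` and the combined file name used in the usage note are made up for illustration; only `/tmp/reg_dump.txt` comes from the instructions above. It assumes each run of the script overwrites `/tmp/reg_dump.txt`, so the dump should be appended before the next run.

```shell
#!/bin/sh
# append_dump (hypothetical helper): append one dump to a combined file,
# preceded by a comment line saying when the dump was taken.
# Usage: append_dump "label" dump_file combined_file
append_dump() {
    {
        echo "### $1"
        cat "$2"
        echo
    } >> "$3"
}
```

After each run of the script, a call such as `append_dump "after powering down the ONKYO" /tmp/reg_dump.txt /tmp/all_dumps.txt` would then add the latest dump under its own heading.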

Thanks!

comment:10 in reply to: ↑ 7 ; follow-up: Changed 6 years ago by Chris Ford <cmford@…>

Replying to matthijs:

> One more question: when does the port become dead, when powering off the ONKYO or when powering it on? In other words, does the port still work when you power off the ONKYO, then unplug the ONKYO and try to connect another device to the same port?

Very interesting! I connected the ONKYO, powered it on and then powered it off. I connected my ACER laptop and it accessed the network just fine. I then re-connected the ONKYO and the port still worked! I powered off the ONKYO and re-connected the ACER laptop. The port worked fine but when I re-connected the ONKYO and powered it up the port was dead.

comment:11 in reply to: ↑ 10 Changed 6 years ago by anonymous

Replying to Chris Ford <cmford@…>:

> Very interesting! I connected the ONKYO, powered it on and then powered it off. I connected my ACER laptop and it accessed the network just fine. I then re-connected the ONKYO and the port still worked! I powered off the ONKYO and re-connected the ACER laptop. The port worked fine but when I re-connected the ONKYO and powered it up the port was dead.

So if I understand it correctly, the port becomes dead when the ONKYO powers on, but only if the network cable was already connected when powering down the ONKYO, right?

comment:12 Changed 6 years ago by Chris Ford <cmford@…>

I have done some more tests and this seems to be a reproducible situation.

  1. Power cycle the Fonera 2n to start from a known state.
  2. Connect ONKYO to LAN port and power up - network accessible.
  3. Power off ONKYO and disconnect.
  4. Connect ACER laptop (which is powered up) - network accessible.
  5. Disconnect ACER, connect ONKYO and power up - network accessible.
  6. Power off ONKYO and disconnect.
  7. Connect ACER (which is powered up) - network accessible.
  8. Disconnect ACER, connect ONKYO and power up - network NOT accessible.

I have tried this sequence several times with the same results, i.e. the first time the ACER is used between the ONKYO power down and power up the LAN port stays active, but the next time the port shuts down after the ONKYO power up sequence.

I hope that this clarifies what I have observed.

Changed 6 years ago by Chris Ford <cmford@…>

Registers after Fonera 2n power cycle

Changed 6 years ago by Chris Ford <cmford@…>

Registers after ONKYO first connected and powered up

Changed 6 years ago by Chris Ford <cmford@…>

Registers after ONKYO 1st powered off

Changed 6 years ago by Chris Ford <cmford@…>

Registers after ONKYO powered back on

Changed 6 years ago by Chris Ford <cmford@…>

Registers with Acer Laptop connected and on

comment:13 Changed 6 years ago by Chris Ford <cmford@…>

I hope the attached dumps give you the information that you need. If not just let me know what else might be useful.

comment:14 follow-up: Changed 6 years ago by matthijs

Thanks! A quick look at the register dumps shows that there are indeed some differences in the PHY registers related to autonegotiation. I haven't looked closely enough yet to draw conclusions, I'll do that next week (gotta run now).

One more question: What port did you connect the ONKYO to this time? If I'm reading the dumps correctly, that would be port 3?

Thanks!

comment:15 in reply to: ↑ 14 Changed 6 years ago by anonymous

Replying to matthijs:

> Thanks! A quick look at the register dumps shows that there are indeed some differences in the PHY registers related to autonegotiation. I haven't looked closely enough yet to draw conclusions, I'll do that next week (gotta run now).
>
> One more question: What port did you connect the ONKYO to this time? If I'm reading the dumps correctly, that would be port 3?
>
> Thanks!

Yes port 3

comment:16 Changed 6 years ago by matthijs

I had a closer look, but without much success.

Looking at the dumps you showed, there is a difference in PHY register 5, which stores the "Link Code Word" (LCW) received from the ONKYO. In normal operation, this contains 41e1 (01000001 11100001), which indicates:

  • 01000000 00000000: Ack, the ONKYO successfully received the LCW from the Fonera
  • 00000001 11100000: Supported technologies, the ONKYO supports 10BaseT, 10BaseT full duplex, 100BaseTX, 100BaseTX full duplex
  • 00000000 00000001: The protocol supported is 00001, meaning 802.3 (ethernet).

When things break, the register contains 43ff (01000011 11111111). This would mean:

  • 01000000 00000000: Ack, the ONKYO successfully received the LCW from the Fonera
  • 00000011 11100000: Supported technologies, the ONKYO supports 100BaseT4 in addition to those mentioned above.
  • 00000000 00011111: The protocol supported is 11111, which is not defined.

This value makes no sense, since the protocol is undefined and it seems unlikely to me that the ONKYO suddenly started supporting 100BaseT4, which is an old protocol.
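
The decoding above can be sketched as a small shell function. It assumes the bit positions that the breakdown above follows (bits 0-4 protocol selector, bits 5-9 supported technologies, bit 14 acknowledge); `decode_lcw` is a name made up for this illustration.

```shell
#!/bin/sh
# decode_lcw (hypothetical helper): decode a link code word given as a hex
# string without "0x", using the bit positions from the breakdown above.
decode_lcw() {
    word=$(( 0x$1 ))
    selector=$(( $word & 0x1f ))       # bits 0-4: protocol selector
    ack=$(( ($word >> 14) & 1 ))       # bit 14: acknowledge
    tech=""
    # Bits 5-9: advertised technologies, lowest bit first.
    bit=5
    for name in 10BaseT 10BaseT-FD 100BaseTX 100BaseTX-FD 100BaseT4; do
        if [ $(( ($word >> $bit) & 1 )) -eq 1 ]; then
            tech="$tech $name"
        fi
        bit=$(( $bit + 1 ))
    done
    printf 'ack=%d selector=0x%02x tech:%s\n' "$ack" "$selector" "$tech"
}

decode_lcw 41e1   # value seen in normal operation
decode_lcw 43ff   # value seen once the port is dead
```

For 41e1 this reports selector 0x01 with the four 10/100 modes, and for 43ff it reports selector 0x1f with 100BaseT4 added, matching the two breakdowns.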

If we assume that this register does actually reflect the actual LCW the ONKYO sends, this would mean the ONKYO violates the spec. In turn, the Fonera also breaks because the register gets fixed to this value and even when disconnecting the ONKYO and connecting your Acer, the register stays fixed to this value.

I thought that perhaps the Fonera gets confused by the 11111 (0x1f) protocol value in the LCW. However, I found a laptop capable of changing those bits using the MII registers, but when my laptop sends 41ff, the Fonera happily establishes the link. It still might be that the 100BaseT4 bit confuses the Fonera, but I couldn't reproduce this (the laptop in question didn't allow me to set that bit, since it doesn't support 100BaseT4 and my attempt to let an Arduino emulate an autonegotiation signal didn't fool the Fonera, I probably missed some implementation detail somewhere).

Of course, we can't be sure that the ONKYO actually sends this 43ff link code word, it might also be that the Fonera gets confused by something else (perhaps it sends ffff and the Fonera drops a few bits?). This is not something we can really confirm without actually attaching a scope or logic analyzer to the ethernet wires...

Summarizing: I think I've exhausted all my debug options. I'll see again if Ralink might be able to provide some info on this issue, but it's probably hard if not impossible to debug this remotely...

comment:17 Changed 6 years ago by Chris Ford <cmford@…>

Thank you for all the information (much of which is beyond my knowledge but much appreciated). I intend to raise the issue on the ONKYO support forum once my registration is complete although since I only seem to have problems with the Fonera 2n router I am not sure that I will get very far there. In the absence of any other information and on the assumption that I cannot help further at this stage, I will change my equipment configuration so that the TX-NR414 is not connected to the Fonera 2n. However if at a later stage you would like me to do further testing (of this or any other issue) do not hesitate to get in touch.

Once again, many thanks for your interest and help with this problem.

comment:18 Changed 6 years ago by matthijs

Thanks!

comment:19 Changed 5 years ago by matthijs

  • Status changed from investigate to infoneeded

Chris, apparently your ONKYO receiver also has upgradeable firmware. Are you running the latest firmware, or can you try upgrading the firmware?

This post describes a similar problem, which is solved with firmware upgrade on the ONKYO: http://onkyoproductsupport.forumotion.com/t1460-tx-nr414-ethernet-not-working

comment:20 Changed 5 years ago by cmford@…

My Onkyo TX-NR414 is running f/w version 1150-9001-0000 which I understand to be the latest. Quite early on with this problem I was asked to update to this version, which I did. The same fault existed when running the previous f/w version and this current one.

My problem is not the same as the one discussed on the ONKYO forum. In that case a wired Ethernet connection was not possible at all. In my case I can get an initial connection but when I power the ONKYO off and then on again the Fonera 2.0n LAN port is disabled until I 'power cycle' the router.

comment:21 Changed 5 years ago by matthijs

  • Status changed from infoneeded to investigate

Yeah, that's why I said "a similar" not "the same" problem :-)

In any case, thanks for checking!

I also had a look at the download page at http://www.intl.onkyo.com/support/firmware/tx-nr414.html but it seems the version number listed there isn't really the latest one (it's smaller than the one you're running).

comment:22 Changed 5 years ago by matthijs

FYI: I asked ONKYO about the firmware version on the website and they confirmed the page I linked above indeed contains the wrong version number, the most current version is the one you have, 1150-9001-0000. They're going to fix that on that page as well soon.

comment:23 Changed 5 years ago by Chris Ford

It is always interesting to see how many little issues come to light when an investigation takes place!

comment:24 Changed 5 years ago by matthijs

  • Resolution set to wontfix
  • Status changed from investigate to closed

Bad news: Edimax (who manufactured the Fonera hardware for FON) has looked into the issue, but did not manage to reproduce it with a few of the A/V receivers they had available. They could not get their hands on the ONKYO receiver you were using on their local market.

Given that this issue seems to affect only this particular ONKYO model (or perhaps even only your particular unit), we've decided to not investigate further and instead invest our time elsewhere. I'd have liked to diagnose and fix this problem, but with a lot of other stuff on the TODO list, investing more time simply doesn't seem defensible (especially since you have a workaround available).

Sorry for the inconvenience, and thanks for all the feedback you have given!

comment:25 Changed 5 years ago by cmford@…

I understand completely why you have come to this conclusion and very much appreciate the time and effort that you have put into the investigation.

Kind regards,

Chris
