Post subject: #9449 NORM 1.5-CTe: Laptop reboots on first resume
PostPosted: Thu Sep 03, 2009 1:02 pm 
   
#9449: Laptop reboots on first resume
--------------------------+-------------------------------------------------
Reporter: wad | Owner: wad
Type: defect | Status: new
Priority: normal | Milestone: 1.5-CTest
Component: not assigned | Version: not specified
Keywords: | Next_action: diagnose
Verified: 1 | Deployment_affected:
Blockedby: | Blocking:
--------------------------+-------------------------------------------------
On an XO-1.5 laptop, the first attempt at suspending/resuming frequently
results in a reboot instead. Subsequent suspend/resumes work fine
(hundreds of thousands of cycles with no reboot or hang).

I've verified this on a B1 laptop, running Q3A10 firmware. It has
certainly been present in earlier phase laptops and earlier versions of
firmware.

--
Ticket URL:
One Laptop Per Child
OLPC bug tracking system


 
 Post subject: Re: #9449 NORM 1.5-CTe: Laptop reboots on first resume
PostPosted: Thu Sep 03, 2009 1:18 pm 

Comment(by wmb@firmworks.com):

I wonder if it happens when you suspend/resume from Linux too, or only
with the OFW command.



 
 Post subject: Re: #9449 NORM 1.5-CTe: Laptop reboots on first resume
PostPosted: Thu Sep 03, 2009 1:40 pm 

Comment(by wmb@firmworks.com):

Some questions:

a) Which command or commands are used for the test? "s"? "rtc-wackup"?
others?

b) What was the power source in a case that you are sure failed? AC or
battery or both?

c) What exactly do you mean by "first attempt"? Is it the first try after
a power-button turn-off, or a full power removal, or a "bye" from OFW, or
a "power-off" from OFW, or a reboot from Linux, or a poweroff from Linux?

d) What is the approximate failure percentage on a system that fails
readily?

I ask because I can't seem to make my B1 fail at the moment.
I have seen this problem before, so I'm not disputing its existence, but
until I can work out how to make it fail more frequently on my machine or
get access to a system that fails readily, it will be very difficult for
me to make any progress.



 
 Post subject: Re: #9449 NORM 1.5-CTe: Laptop reboots on first resume
PostPosted: Sat Sep 05, 2009 1:14 pm 

Comment(by wad):

Additional clarification on the reboot:

I saw it happen on all B1 prototypes tested. It makes no difference to
the frequency whether the laptop is plugged into DC power or running off
battery. It makes no difference whether the EC has been rebooted at the
same time as the system.

The source of the resume doesn't make a difference. I also see reboots on
first resumes that happen automatically due to battery and touchpad SCI
events.



 
 Post subject: Re: #9449 NORM 1.5-CTe: Laptop reboots on first resume
PostPosted: Sat Sep 05, 2009 1:24 pm 

Comment(by wad):

In response to Mitch's questions:

a) This problem occurs regardless of the command. It has been seen with
wackup-test-ec and autowack-test as well as "s". All testing was done
using the "s" command once this was realized.

b) Any power source

c) The first suspend attempt after an OFW boot. Nothing else seems to
make a difference.

d) On #20, I saw 4 reboots and 15 wakeups on the first attempt. On #18, I
saw 2 reboots and 5 wakeups. On #7, I saw 3 reboots and 4 wakeups. On
#2, I saw 1 reboot and 6 wakeups. I would guess the mean is about 25%.
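
As a quick check on that estimate, the counts in (d) can be pooled. This is
only a sketch: it assumes each "wakeup" is a successful first resume counted
separately from a reboot, so attempts = reboots + wakeups. The dictionary and
function names are illustrative, not part of any OLPC tool.

```python
# (reboots, wakeups) on the first suspend attempt, per laptop, as quoted in (d).
# Assumption: wakeups are successful resumes, counted separately from reboots.
first_attempts = {
    "#20": (4, 15),
    "#18": (2, 5),
    "#7": (3, 4),
    "#2": (1, 6),
}


def failure_rate(reboots, wakeups):
    """Fraction of first-resume attempts that ended in a reboot."""
    return reboots / (reboots + wakeups)


# Per-laptop rates, to show how much the units differ from each other.
per_unit = {unit: failure_rate(r, w) for unit, (r, w) in first_attempts.items()}

# Pooled rate across all laptops: total reboots over total attempts.
total_reboots = sum(r for r, _ in first_attempts.values())
total_attempts = sum(r + w for r, w in first_attempts.values())
pooled = total_reboots / total_attempts

for unit, rate in per_unit.items():
    print(f"{unit}: {rate:.1%}")
print(f"pooled: {total_reboots}/{total_attempts} = {pooled:.1%}")
```

Under that reading the pooled figure is 10 reboots in 40 attempts, which
matches the 25% guess, although the per-laptop rates vary widely on such
small samples.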



 
 Post subject: Re: #9449 NORM 1.5-CTe: Laptop reboots on first resume
PostPosted: Sat Sep 05, 2009 3:58 pm 

Comment(by wmb@firmworks.com):

Try dev.laptop.org:~wmb/q3a10e.rom

It sets a bit that prevents the PLL from being turned off in suspend
state. I found a sentence in the BIOS Porting Notes document (neither the
PM_VX855 nor the PG_VX855 manual, but a third document) that says that bit
must be set before suspend. Apparently they mean it.

I haven't experienced a resume crash since setting that bit, in quite a few
tries on my A2 board that used to fail fairly frequently.

By the way, one more data point - I dumped the memory controller registers
at the failure point and found that they were at their default values
instead of holding the previously-set values (which they do on a
successful resume).



 
 Post subject: Re: #9449 NORM 1.5-CTe: Laptop reboots on first resume
PostPosted: Sun Sep 06, 2009 6:54 am 

Comment(by wmb@firmworks.com):

The fix reflected in q3a10e greatly reduces the failure frequency but does
not eliminate it. The syndrome of the less-frequent failures is slightly
different - the memory is not completely dead, but has bit-rot.

The remaining problem is that DMA activity from the USB sometimes kicks
the RAMs out of self-refresh. Turning off the USB controller seems to fix
the problem. My latest test OFW has run several hundred
reboot/suspend/resume cycles without a resume reset.

This explains why the failure only happened on the first suspend. After
that, the USB controller was off (OFW doesn't turn it back on after
resume).
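
The reasoning above can be illustrated with a toy model. It is not firmware
code: the function name, the probability parameter, and the assumption that
the USB controller is enabled after every OFW boot are all made up for
illustration. The only behavior taken from the comment is that USB DMA can
disturb self-refresh while the controller is on, and that OFW leaves the
controller off after a resume.

```python
import random


def first_failing_cycle(cycles, p_disturb, rng):
    """Return the index of the suspend cycle that reset the machine, or None.

    Toy model only: p_disturb is a hypothetical chance that USB DMA kicks
    the RAMs out of self-refresh during one suspend with the controller on.
    """
    usb_on = True  # assumption: the USB controller is running after an OFW boot
    for cycle in range(cycles):
        # RAM is only at risk while the USB controller can still issue DMA.
        if usb_on and rng.random() < p_disturb:
            return cycle  # resume finds bad memory and the laptop reboots
        # Resume succeeded; OFW does not turn the USB controller back on.
        usb_on = False
    return None


rng = random.Random(1)
outcomes = [first_failing_cycle(100, 0.25, rng) for _ in range(1000)]
print("failures on cycle 0:", sum(o == 0 for o in outcomes))
print("failures on later cycles:", sum(o not in (None, 0) for o in outcomes))
```

In this model a failure can only ever land on cycle 0, which is the same
pattern as the reported bug: reboots on the first resume after a boot and
none afterwards.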



 
 Post subject: Re: #9449 NORM 1.5-CTe: Laptop reboots on first resume
PostPosted: Thu Oct 01, 2009 11:09 am 
#9449: Laptop reboots on first resume
------------------------------------+---------------------------------------
Reporter: wad | Owner: wad
Type: defect | Status: closed
Priority: normal | Milestone: 1.5-CTest
Component: not assigned | Version: not specified
Resolution: fixed | Keywords:
Next_action: diagnose | Verified: 1
Deployment_affected: | Blockedby:
Blocking: |
------------------------------------+---------------------------------------
Changes (by wad):

* status: new => closed
* resolution: => fixed


Comment:

This seems to be fixed, or greatly alleviated, by Q3A11 or later firmware.



 




