Cool VL Viewer forum

sporadic crash on file dialogue 

Joined: 2012-06-01 15:38:17
Posts: 29
Simply closing CoolVLViewer_1.26.4.36, I got this in backtrace:

#4 0x441cd1f1 in malloc_consolidate () from /lib/libc.so.6


Attachments:
2012-10-28 12:39:41

Joined: 2009-03-17 18:42:51
Posts: 6043
Andabata Thor wrote:
Simply closing CoolVLViewer_1.26.4.36, I got this in backtrace:

#4 0x441cd1f1 in malloc_consolidate () from /lib/libc.so.6

That's a crash on exit which, while a bug, is completely harmless (all user settings are saved, nothing is lost)... The exact reason for the crash is hard to tell, but it follows from the viewer having been disconnected from the grid (which is a network issue, not a viewer issue), as the log clearly shows:
Code:
2012-10-28T11:21:01Z WARNING: HTTPGetResponder::completedRaw: Worker not found for 459f2d84-52e9-b822-f0ce-557c2b872282
2012-10-28T11:21:08Z WARNING: removeRegion: RegionDump: Noyo 216.82.45.61:13000 { 256512, 255232, 0 }
2012-10-28T11:21:08Z WARNING: removeRegion: RegionDump: Cowell 216.82.35.220:13001 { 256256, 255232, 0 }
2012-10-28T11:21:08Z WARNING: removeRegion: RegionDump:  216.82.34.207:13002 { 256512, 254976, 0 }
2012-10-28T11:21:08Z WARNING: removeRegion: Agent position global { 256256, 255246, 27.9549 } agent { -256, 14.4965, 27.9549 }
2012-10-28T11:21:08Z WARNING: removeRegion: Regions visited: 2
2012-10-28T11:21:08Z WARNING: removeRegion: gFrameTimeSeconds = 887.548
2012-10-28T11:21:08Z WARNING: removeRegion: Disabling region Noyo that agent is in !
2012-10-28T11:21:08Z WARNING: LLAlertDialog: Alert: You have been logged out of Second Life:
            You have been disconnected from the region you were in.
You can still look at existing IM and chat by clicking 'View IM & Chat'. Otherwise, click 'Quit' to exit Second Life immediately.
2012-10-28T11:21:08Z INFO: disableCircuit: LLMessageSystem::disableCircuit for 216.82.45.61:13000
2012-10-28T11:21:08Z INFO: disableCircuit: Couldn't find circuit code for 216.82.45.61:13000, ignoring...
2012-10-28T11:21:08Z INFO: saveSnapshot: Saving snapshot to: /root/.secondlife/andabata_thor/screen_last.bmp
2012-10-28T11:21:09Z INFO: disconnectViewer: Disconnecting viewer!

So, it's definitely unrelated to the bug described in this thread...


2012-10-28 14:10:11

Joined: 2012-08-08 17:51:35
Posts: 90
Henri Beauchamp wrote:
It's not a tcmalloc(-less) issue (see the stack trace...), so I doubt very much that downgrading to 1.26.4.32 will make any difference.


Yes, I can see what you mean about the stack trace, but I tend to disagree with you on the difference.

I have been running 1.26.4.32 for two full evenings now, and it's definitely more stable so far: I have not had a single file dialogue related crash with it yet.
With the latest viewer I would likely have had at least a good handful of crashes by now, as I have been heavy on the snapshot and file dialogues.

The mystery thickens, but apparently something changed in the way the viewer and GTK get along. I don't blame you; it can very well be a buggy library somewhere, or something else affecting the outcome, but until I can prove myself wrong the fact remains: with tcmalloc it's working, without it it's not (unless something else has changed since .32 in the way the viewer talks to GTK). :roll:


2012-10-28 22:05:09

Joined: 2009-03-17 18:42:51
Posts: 6043
There is no difference whatsoever in how the viewer uses GTK+ between v1.26.4.32 and newer versions. Versions using tcmalloc can be more tolerant of double free() or similar errors that could occur in a buggy library. If you want to make a tcmalloc-less viewer more tolerant of such errors under Linux, you could add "export MALLOC_CHECK_=1" (see 'man malloc' for details) in the cool_vl_viewer wrapper script before bin/cool_vl_viewer-bin is executed...
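
A minimal sketch of that suggestion, for illustration only: the helper below sets MALLOC_CHECK_ for a single launched command instead of exporting it for the whole script. The function name run_with_malloc_check is hypothetical and the rest of the real wrapper script is not reproduced; only the variable name and bin/cool_vl_viewer-bin come from the post above. How glibc reacts to each value is described in its man pages and may differ on newer glibc releases, so treat the level as something to verify on your own system.

```shell
#!/bin/sh
# Sketch only: run_with_malloc_check is a hypothetical helper, not part of
# the real cool_vl_viewer wrapper script.

# Usage: run_with_malloc_check LEVEL COMMAND [ARGS...]
# Runs COMMAND with MALLOC_CHECK_ set to LEVEL in its environment only.
run_with_malloc_check() {
    level="$1"
    shift
    env MALLOC_CHECK_="$level" "$@"
}

# In the wrapper, the viewer binary could then be started along these lines:
#   run_with_malloc_check 1 bin/cool_vl_viewer-bin "$@"
```

Compared with a plain "export", this keeps the setting away from any other program the wrapper might start.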


2012-10-29 00:40:21

Joined: 2012-08-08 17:51:35
Posts: 90
Henri Beauchamp wrote:
you could add "export MALLOC_CHECK_=1" (see 'man malloc' for details) in the cool_vl_viewer wrapper script before bin/cool_vl_viewer-bin is executed...


I'll give that a serious try, perhaps tomorrow, and keep you updated on the results.
Would you recommend trying it with the 1.26.4 branch or the 1.26.5 branch, or does it make no difference?

For the record, another trouble-free evening with 1.26.4.32 has passed. :roll:


2012-10-29 22:12:25

Joined: 2012-08-08 17:51:35
Posts: 90
So, I've tried putting "export MALLOC_CHECK_=1" in the wrapper script before any other exports, and have run the latest viewer in the stable branch like that for two days. While it seems to have lowered the crash rate, it does not make the viewer as stable as the older .32 version.

I have also looked through the update log again, without finding any updates to any package that looks like it would have anything to do with this issue. :(

And for good measure I have reinstalled libfreetype, libcairo and libpango, just in case, as they showed up close to the crash... and guess what, I have now captured a crash log that complains about malloc somehow... Would you mind looking at it and seeing what it tells you? I know nothing about mallocs! ;)

It also bugs me a bit that the viewer has issues with some of the icons in the file dialogue, while I can't find another program using GTK dialogues that shows the same issue. If only I knew where to look for answers! :roll:


Attachments:
20121101-savecrash.2.txt.gz [11.7 KiB]
Downloaded 153 times
2012-11-01 18:59:18

Joined: 2009-03-17 18:42:51
Posts: 6043
Jessica Hultcrantz wrote:
So, I've tried putting "export MALLOC_CHECK_=1" in the wrapper script before any other exports, and have run the latest viewer in the stable branch like that for two days. While it seems to have lowered the crash rate, it does not make the viewer as stable as the older .32 version.

I have also looked through the update log again, without finding any updates to any package that looks like it would have anything to do with this issue. :(

And for good measure I have reinstalled libfreetype, libcairo and libpango, just in case, as they showed up close to the crash... and guess what, I have now captured a crash log that complains about malloc somehow... Would you mind looking at it and seeing what it tells you? I know nothing about mallocs! ;)

It also bugs me a bit that the viewer has issues with some of the icons in the file dialogue, while I can't find another program using GTK dialogues that shows the same issue. If only I knew where to look for answers! :roll:

The crash log clearly shows that the crash doesn't come from the viewer code itself (it's deep in the trace, among the many GTK+ components, far below the point where the viewer called the GTK+ file chooser).
It has something to do with your specific system, since no one else seems to complain about the same issue (and I can't reproduce such crashes here either, on any of my many Linux boxes).
As for why viewers using tcmalloc seem more stable on your system: tcmalloc simply hides the cause of the crash by ignoring offending free() calls, and it does so more thoroughly than the system malloc()/free() do with MALLOC_CHECK_ set.
The crash is probably related to some icon memory being freed while it was never allocated (because the icon failed to load, for some reason): check your file and directory permissions as well as your quotas (are you running out of file handles while running the viewer, for example?). You could, for example, try to execute the viewer as root (this will avoid any permission or quota issue) and see if the GTK+ file chooser properly loads its icons...
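
The checks suggested above can be sketched as a few shell helpers. This is a Linux-only sketch that relies on /proc; the helper names are hypothetical, the icon path in the usage comment is a placeholder, and the process name is the binary mentioned earlier in the thread.

```shell
#!/bin/sh
# Linux-only sketch (relies on /proc). Helper names are hypothetical.

# Print the number of file descriptors currently open in process PID ($1).
count_open_fds() {
    ls "/proc/$1/fd" | wc -l
}

# Print the "Max open files" limit line for process PID ($1).
show_fd_limit() {
    grep 'Max open files' "/proc/$1/limits"
}

# Report whether the current user can read the file given as $1.
check_readable() {
    if [ -r "$1" ]; then
        echo "readable: $1"
    else
        echo "NOT readable: $1"
    fi
}

# Possible use while the viewer is running:
#   pid=$(pidof cool_vl_viewer-bin)
#   count_open_fds "$pid"; show_fd_limit "$pid"
#   check_readable /path/to/icon/named/in/the/console/error
```

If the open descriptor count sits close to the limit, that would point to file handle exhaustion rather than permissions.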


2012-11-01 20:24:50

Joined: 2012-08-08 17:51:35
Posts: 90
The crash might not come from the viewer code itself, but the viewer is to blame. I wouldn't be surprised if it's an upstream issue we have revealed.

I set up a test case as follows:
My old laptop, running the same Linux distro but with a far from good graphics card (so it runs SL at 3 FPS *lol*).
No updates have been done on it since mid-August, so a recent library change would most likely not show up on this computer. :!:
I had *lol* 1.26.0.18, 1.26.2.24 and 1.26.3.8 installed on it.
And I had the installation files for most of the 1.26.4 versions saved on my desktop, so I ran fresh installations into separate directories on the laptop for a bunch of 1.26.4 versions.

For every single viewer version I started it from the console (from inside the appropriate directory of course, just running ./cool_vl_viewer), logged into the beta grid, took a snapshot (ctrl-shift-s) and selected save to disk, while watching the console output for any issues. :idea:
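
The per-version test above could be made a little more repeatable by saving the console output of each run and counting the GTK/GDK messages afterwards. A minimal sketch, assuming one install directory per version and one log file per run; the helper name, the log file name and the case-insensitive "gtk|gdk" pattern are all assumptions, since the exact wording of the console messages is not quoted here.

```shell
#!/bin/sh
# Sketch only: helper name, log file name and match pattern are assumptions.

# Print how many lines of LOGFILE ($1) mention GTK or GDK (case-insensitive).
count_gtk_messages() {
    grep -c -i -E 'gtk|gdk' "$1"
}

# Possible use, from inside one viewer's install directory:
#   ./cool_vl_viewer 2>&1 | tee console.log   # log in, take a snapshot, quit
#   count_gtk_messages console.log
```

Comparing the counts between two adjacent versions gives a quick way to see at which version the messages first appear.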

What I found was that all versions up to 1.26.4.22 are fine and show icons in the file dialogues (thus no errors). :D
Viewer versions 1.26.4.23 to 1.26.4.32 show no icons and throw those GDK/GTK errors on the console, but do not crash. :shock:
Recent tcmalloc-less viewers crash when drawing the file dialogue. :evil:

Since the only thing that changed between these runs (except the clock) was which viewer was started, there has to be an issue somewhere in the (LL?) code. ;)

Looking back at the version history, it seems you made a lot of changes between .22 and .23, to memory usage management and to the build system. Is there any chance that this created an incompatibility with GTK?

Here's my lib version, 2.11 > 2.8 (the new requirement), right?
Code:
GNU C Library (Debian EGLIBC 2.11.3-4) stable release version 2.11.3, by Roland McGrath et al.
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.4.5.
Compiled on a Linux 2.6.32 system on 2012-06-08.
Available extensions:
        crypt add-on version 2.1 by Michael Glad and others
        GNU Libidn by Simon Josefsson
        Native POSIX Threads Library by Ulrich Drepper et al
        BIND-8.2.3-T5B
For bug reporting instructions, please see:
<http://www.debian.org/Bugs/>.

I might add that it is not an issue with permissions.
I took the path from the console error message, copied and pasted it into a simple image viewer (Geeqie in this case) and every single icon showed perfectly well.
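
On the libc version question above: dotted version numbers compare field by field, not as decimals, so 2.11 is indeed newer than 2.8. A small sketch of that comparison, assuming GNU sort with its -V (version sort) option; version_ge is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: version_ge is a hypothetical helper; it needs GNU sort (-V).

# Succeed (exit status 0) if version $1 is greater than or equal to version $2.
version_ge() {
    newest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)
    [ "$newest" = "$1" ]
}

# Possible use:
#   version_ge 2.11.3 2.8 && echo "libc is new enough"
# On glibc systems the installed version can be printed with:
#   getconf GNU_LIBC_VERSION
```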


2012-11-03 14:50:53

Joined: 2009-03-17 18:42:51
Posts: 6043
I will say it one last time: this is NOT a viewer bug. The fact that older versions using tcmalloc "hid" the bug (though it is likely they still suffer from heap corruption due to that bug and might crash later in the session because of it) does not mean newer tcmalloc-less versions are faulty!

The crash is somewhere deep inside your GTK+ library (and its many components). The viewer just calls the GTK+ file chooser, and it does so as it always did (there has been no change whatsoever in how GTK+ is called/used by the viewer in many, many months!): if the GTK+ file chooser then causes a crash, it's certainly not the viewer's fault!

I cannot help you (I can't reproduce this bug here at all, so I can't diagnose the actual cause or point to the culprit GTK+ library). Sorry.

If you wish, just recompile the viewer yourself and enable tcmalloc (the corresponding flag is in indra/cmake/00-Common.cmake)...


2012-11-03 15:18:32

Joined: 2012-08-08 17:51:35
Posts: 90
Well, explain why older viewers like 1.26.4.22 work then, please?

I told you in the last message that I found that the issue clearly starts with 1.26.4.23 and now escalates when tcmalloc is removed. No sign of it whatsoever prior to 1.26.4.23.

Please re-read my previous post.

If it were a pure GTK bug as you imply, wouldn't the issue persist even with the really old viewers??? And what about all the other programs using the same dialogues without issues? It doesn't make sense to me!

As said, I can clearly see there is something happening between .22 and .23 now that I have tried about 15 viewer versions on the same computer, and on my systems that counts as reproducible. I just don't know how to trace the actual cause down. Sorry!!! :cry:

To be very, very clear: .22 and earlier show NO warnings and do NOT have any issues with the GTK dialogues, as there are NO missing icons with these versions and thus naturally no errors, while .23 and onwards obviously do, because they won't pick up the icons and they produce errors. :evil:

Meh, I should have complained at once, back in August.

I will try compiling, and will let you know about the outcome, but the further this goes, the more it seems that something happened to the code back in the summer.
From what I can see, a lot of prepackaged libraries changed between these two versions. What are the odds that one of these plays a role, for example?


2012-11-03 16:47:44