Discussion:
DWARF output, type of char[]?
Cristian Vlasceanu
2007-04-02 23:42:06 UTC
What is the DWARF type of char[], as output by the DMD backend?

I have compiled this sample:

import std.stdio;

int main()
{
    char[] greet = "Hello";

    writefln(greet);
    return 0;
}

Then ran
readelf -w hello

As shown in this snippet, the type of greet is unsigned long long (entry at offset <c2>):

<1><c2>: Abbrev Number: 2 (DW_TAG_base_type)
    DW_AT_name : unsigned long long
    DW_AT_byte_size : 8
    DW_AT_encoding : 7 (unsigned)
<1><d8>: Abbrev Number: 4 (DW_TAG_subprogram)
    DW_AT_sibling : <104>
    DW_AT_name : _Dmain
    DW_AT_decl_file : 1
    DW_AT_decl_line : 3
    DW_AT_type : <bb>
    DW_AT_low_pc : 0x804a358
    DW_AT_high_pc : 0x804a383
    DW_AT_frame_base : 1 byte block: 55 (DW_OP_reg5)
<2><f5>: Abbrev Number: 3 (DW_TAG_variable)
    DW_AT_name : greet
    DW_AT_type : <c2>
    DW_AT_location : 2 byte block: 91 78 (DW_OP_fbreg: -8)
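
For readers wondering how the two bytes 91 78 turn into "DW_OP_fbreg: -8": 0x91 is the DW_OP_fbreg opcode and its operand is a signed LEB128 number. A minimal Python sketch of that decoding follows. The function names are made up for illustration and are not part of readelf or any debugger.

```python
DW_OP_fbreg = 0x91  # opcode value defined by the DWARF specification


def decode_sleb128(data, offset=0):
    """Decode one signed LEB128 value; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        shift += 7
        if not (byte & 0x80):  # high bit clear: this was the last byte
            if byte & 0x40:  # bit 6 of the last byte is the sign bit
                result -= 1 << shift
            return result, offset


def decode_fbreg(block):
    """Return the frame-base offset stored in a DW_OP_fbreg location block."""
    if block[0] != DW_OP_fbreg:
        raise ValueError("not a DW_OP_fbreg expression")
    value, _ = decode_sleb128(block, 1)
    return value


print(decode_fbreg(bytes([0x91, 0x78])))  # prints -8
```

So greet lives 8 bytes below the frame base, which fits an 8-byte local on a 32-bit target.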
BCS
2007-04-03 00:35:14 UTC
Reply to Cristian,
Post by Cristian Vlasceanu
What is the DWARF type of char[], as output by the DMD backend?
I've wondered about this kind of thing too. Why isn't it some sort of ptr/length
pair struct? Name it something remotely readable and it will be a ton better
than unsigned long long. The expression that gets the content of a char[]
as something gdb can read is several inches long.
Cristian Vlasceanu
2007-04-03 01:40:57 UTC
Post by BCS
Reply to Cristian,
Post by Cristian Vlasceanu
What is the DWARF type of char[], as output by the DMD backend?
The expression that gets the content of a char[]
as something gdb can read is several inches long.
I am not sure that I understand this. Is there a way of transforming the unsigned long long into something meaningful?

Zero treats all types as if the source were C/C++.

I am working on letting the user customize how variables are displayed (via Python scripts). I am doing it mainly to display C++/STL containers in a sane way, but I think that with a little work it may do the trick for D types as well.

Cristian
BCS
2007-04-03 16:20:39 UTC
Post by Cristian Vlasceanu
Post by BCS
Reply to Cristian,
Post by Cristian Vlasceanu
What is the DWARF type of char[], as output by the DMD backend?
The expression that gets the content of a char[]
as something gdb can read is several inches long.
I am not sure that I understand this? Is there a way of transforming the unsigned long long into something meaningful?
Yes.

Given T[] arr;

the contents start at: (cast(T**)(&arr))[1]
the length is: *cast(size_t*)(&arr)

and that's if I remember correctly. It's ugly and messy and a total pain
in the whatever. But it works.

However if an array of type T[] were to claim to be

struct
{
T* ptr;
size_t length;
}

or something close to that, then you could access it almost as normal.
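
The same split can be done on the raw 64-bit number that a C-minded debugger prints. Below is a minimal Python sketch. It assumes a 32-bit little-endian target with the length word stored first and the pointer second (which is what the length expression above implies), and the address in the example is invented for illustration.

```python
import struct


def split_d_array(raw):
    """Split the 8-byte value shown for a T[] into (length, ptr).

    Assumption: 32-bit little-endian target, length word first,
    pointer word second.
    """
    length, ptr = struct.unpack("<II", struct.pack("<Q", raw))
    return length, ptr


# Made-up example: a 5-element array whose data sits at 0x0805C0A0.
raw = (0x0805C0A0 << 32) | 5
length, ptr = split_d_array(raw)
print(length, hex(ptr))  # prints 5 0x805c0a0
```

On such a target the low 32 bits of the "unsigned long long" are the length and the high 32 bits are the data pointer.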
Post by Cristian Vlasceanu
Zero treats all types as if the source were C/C++.
I am working on allowing the user to customize the way variables are being displayed (via Python scripts). I am doing it mainly for being able to display C++/STL containers in a sane way, but I think that with a little work it may do the trick for D types as well.
Cristian
Cristian Vlasceanu
2007-04-03 16:53:18 UTC
Post by BCS
Post by Cristian Vlasceanu
Post by BCS
Reply to Cristian,
Post by Cristian Vlasceanu
What is the DWARF type of char[], as output by the DMD backend?
The expression that gets the content of a char[] as something gdb can
read is several inches long.
I am not sure that I understand this? Is there a way of
transforming the unsigned long long into something meaningful?
Yes.
Given T[] arr;
the contents start at: (cast(T**)(&arr))[1]
the length is: *cast(size_t*)(&arr)
and that's if I remember correctly. It's ugly and messy and a total pain
in the whatever. But it works.
However if an array of type T[] were to claim to be
struct
{
T* ptr;
size_t length;
}
or something close to that, then you could access it almost as normal.
I do not know enough D, but is "unsigned long long" an otherwise
legal D type? Because if (as I suspect) it isn't, and it can be
unequivocally determined that it stands for a char[] whenever the
translation unit's language is D, I could easily make it work in Zero.
Out of the box.
Frits van Bommel
2007-04-03 17:55:41 UTC
Post by Cristian Vlasceanu
I do not know enough D, but is "unsigned long long" an otherwise
legal D type? Because if (as I suspect) it isn't, and it can be
unequivocally determined that it stands for a char[] whenever the
translation unit's language is D, I could easily make it work in Zero.
Out of the box.
You may want to check what type "ulong" is marked as. Since that's a
64-bit unsigned integer (like unsigned long long typically is) it may
also be marked as "unsigned long long"...
BCS
2007-04-03 19:09:44 UTC
Post by Frits van Bommel
Post by Cristian Vlasceanu
I do not know enough D, but is "unsigned long long" an otherwise
legal D type? Because if (as I suspect) it isn't, and it can be
unequivocally determined that it stands for a char[] whenever the
translation unit's language is D, I could easily make it work in Zero.
Out of the box.
You may want to check what type "ulong" is marked as. Since that's a
64-bit unsigned integer (like unsigned long long typically is) it may
also be marked as "unsigned long long"...
What is needed is for T[] to be marked as something special. Then you
could treat it as described. FWIW D has its own DWARF "language number"
so we should be able to add new types. Then the debugger could treat it
as said struct.

A short term hack could be to have the compiler do the magic and mark it
as some sort of auto generated struct type.

Disclaimer: I know slightly more than 0 about how DWARF works under the
hood.
Cristian Vlasceanu
2007-04-03 19:44:10 UTC
Post by BCS
Post by Frits van Bommel
Post by Cristian Vlasceanu
I do not know enough D, but is "unsigned long long" an otherwise
legal D type? Because if (as I suspect) it isn't, and it can be
unequivocally determined that it stands for a char[] whenever the
translation unit's language is D, I could easily make it work in Zero.
Out of the box.
You may want to check what type "ulong" is marked as. Since that's a
64-bit unsigned integer (like unsigned long long typically is) it may
also be marked as "unsigned long long"...
What is needed is for T[] to be marked as something special. Then you
could treat it as described. FWIW D has its own DWARF "language number"
so we should be able to add new types. Then the debugger could treat it
as said struct.
A short term hack could be to have the compiler do the magic and mark it
as some sort of auto generated struct type.
Disclaimer: I know slightly more than 0 about how DWARF works under the
hood.
I have verified this: with DMD 1.010, ulong and char[] are indistinguishable in the DWARF.

My personal preference is to have it represented as a struct.
Frits van Bommel
2007-04-03 20:54:44 UTC
Permalink
Post by Cristian Vlasceanu
I have verified this, with 1.010, ulong and char[] are indistinguishable in the DWARF.
My personal preference is to have it represented as a struct.
A struct doesn't tell you how much of the data at .ptr is valid.
IMHO, for '-g' (native-D debugging info) it should be a special type.
(Assuming DWARF supports language-specific types, is that the case?)
That way, D-aware debuggers can for example pretty-print char[] strings
as regular strings instead of (size_t, char*) pairs.
For '-gc' (pretends-to-be-C debugging info) a struct should probably be
used though.
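
A rough idea of what such pretty-printing could look like, as a Python sketch. It assumes a 32-bit little-endian target where the 8 bytes of a char[] hold the length followed by the data pointer. The read_memory callback and the fake memory below are hypothetical stand-ins and not the API of any existing debugger.

```python
import struct


def format_char_array(raw, read_memory):
    """Render the 8-byte value of a char[] as a quoted string.

    raw         -- the value as a 64-bit unsigned integer
    read_memory -- hypothetical callback (address, size) -> bytes
    Assumption: 32-bit little-endian target, length first, pointer second.
    """
    length, ptr = struct.unpack("<II", struct.pack("<Q", raw))
    # Read exactly `length` bytes: D strings need no NUL terminator.
    data = read_memory(ptr, length)
    return '"' + data.decode("utf-8", errors="replace") + '"'


# Fake debuggee memory, for illustration only.
FAKE_BASE = 0x1000
FAKE_MEMORY = b"Hello, world"


def fake_read(address, size):
    start = address - FAKE_BASE
    return FAKE_MEMORY[start:start + size]


raw = (FAKE_BASE << 32) | 5  # length 5, data at FAKE_BASE
print(format_char_array(raw, fake_read))  # prints "Hello"
```

The printer only works if it knows the variable really is a char[] and not a ulong, so the debug info still has to tell the two apart.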
BCS
2007-04-03 21:02:11 UTC
Reply to Frits,
Post by Frits van Bommel
Post by Cristian Vlasceanu
I have verified this, with 1.010, ulong and char[] are
indistinguishable in the DWARF.
My personal preference is to have it represented as a struct.
A struct doesn't tell you how much of the data at .ptr is valid.
IMHO, for '-g' (native-D debugging info) it should be a special type.
(Assuming DWARF supports language-specific types, is that the case?)
I think so based on this: http://dwarfstd.org/Dwarf3.pdf
Post by Frits van Bommel
That way, D-aware debuggers can for example pretty-print char[] strings
as regular strings instead of (size_t, char*) pairs.
For '-gc' (pretends-to-be-C debugging info) a struct should probably
be used though.
That's about what I'm thinking.
Cristian Vlasceanu
2007-04-04 00:17:07 UTC
Post by Frits van Bommel
Post by Cristian Vlasceanu
I have verified this, with 1.010, ulong and char[] are indistinguishable in the DWARF.
My personal preference is to have it represented as a struct.
A struct doesn't tell you how much of the data at .ptr is valid.
IMHO, for '-g' (native-D debugging info) it should be a special type.
(Assuming DWARF supports language-specific types, is that the case?)
That way, D-aware debuggers can for example pretty-print char[] strings
as regular strings instead of (size_t, char*) pairs.
For '-gc' (pretends-to-be-C debugging info) a struct should probably be
used though.
How about adding a "DWARF vendor extension":

Add a new tag, DW_TAG_array.
Array "classes" will have a corresponding DW_TAG_array entry in the debug info,
each with children entries such as:
    DW_AT_type // DWARF std, points to the element type
    ... // TBD

Array instances will be represented by entries that have children of the following types:
    DW_AT_capacity // D extension
    DW_AT_size // DWARF standard
    DW_AT_value // address of 1st elem, DWARF standard
    DW_AT_type // DWARF std, would point to an entry of DW_TAG_array type

Thoughts?
BCS
2007-04-04 01:01:10 UTC
Reply to Cristian,
Post by Cristian Vlasceanu
Post by Frits van Bommel
Post by Cristian Vlasceanu
I have verified this, with 1.010, ulong and char[] are
indistinguishable in the DWARF.
My personal preference is to have it represented as a struct.
A struct doesn't tell you how much of the data at .ptr is valid.
IMHO, for '-g' (native-D debugging info) it should be a special type.
(Assuming DWARF supports language-specific types, is that the case?)
That way, D-aware debuggers can for example pretty-print char[] strings
as regular strings instead of (size_t, char*) pairs.
For '-gc' (pretends-to-be-C debugging info) a struct should probably be
used though.
add a new tag, DW_TAG_array.
Array "classes" will have a DW_TAG_array corresponding entry in the debug info,
DW_AT_type // DWARF std, points to the element type
... // TBD
DW_AT_capacity // D extension
DW_AT_size // DWARF standard
DW_AT_value // address of 1st elem, DWARF standard
DW_AT_type // DWARF std, would point to an entry of DW_TAG_array type
thoughts?
Sounds good, but I'm no DWARF expert.

Walter?
