Talk:Assembly language

This is the talk page for discussing improvements to the Assembly language article.
This is not a forum for general discussion of the article's subject.

Assembly language was nominated as an Engineering and technology good article, but it did not meet the good article criteria at the time (September 17, 2020). There are suggestions on the review page for improving the article. If you can improve it, please do; it may then be renominated.

Computer science High‑importance

This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

This article has been rated as High-importance on the project's importance scale.

Computing: Software High‑importance

This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as High-importance on the project's importance scale.
This article is supported by WikiProject Software (assessed as High-importance).
This article is supported by Early computers task force (assessed as High-importance).

Macro pseudo-ops in open code

@Wtshymanski: In several assemblers, pseudo-ops meant for defining macros can also be used in open code.

I added the text

"In addition, some of the assembler statements useful in macro definitions are also valid in open code, e.g., the HLASM statements

- AGO: Transfer to specified assembler statement
- AIF: Evaluate logical and transfer if true
- GBLx: Define compile-time variables in a global context
- LCLx: Define compile-time variables in a local context
- SETx: Evaluate expressions and assign their values to compile-time variables"

to Assembly language#Macros, and Wtshymanski reverted the change, stating that it was out of place. I don't see anything wrong with either the text or its location. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 00:18, 19 November 2021 (UTC)

Wikipedia is not a textbook. It's probably excessively detailed, especially in an already over-long article, to go into all the fascinating tangents. --Wtshymanski (talk) 21:38, 22 November 2021 (UTC)

@Wtshymanski: The list of pseudo-ops may be TMI, but surely the fact that they are allowed in open code belongs there. How about just "In addition, some of the assembler statements useful in macro definitions are also valid in open code"? --Shmuel (Seymour J.) Metz Username:Chatul (talk) 01:42, 23 November 2021 (UTC)

This article doesn't define "open code" so the phrase is meaningless to the reader. This reader, anyway. --Wtshymanski (talk) 03:22, 24 November 2021 (UTC)

@Wtshymanski: Surely it should, since some pseudo-ops are invalid in open code, e.g., MEXIT in HLASM. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:46, 24 November 2021 (UTC)

When it comes to defining which features work in which implementations, we leave the realm of an encyclopedia article and descend to the level of a textbook...or a programmer's manual or how-to guide. The first dozen hits on Google Books for "open code" are split between "open source" and food best-before dates written in plain language instead of a cipher. --Wtshymanski (talk) 19:39, 24 November 2021 (UTC)

@Wtshymanski: "open code" = "outside of macro definitions". It's a pretty well understood term among assembler programmers. Peter Flass (talk) 20:09, 24 November 2021 (UTC)

This (https://www.google.ca/books/edition/IBM_Assembler/6thQAAAAYAAJ?hl=en&gbpv=1&bsq=%22open+code%22+%22assembler%22&dq=%22open+code%22+%22assembler%22&printsec=frontcover) says "open code" is an IBM-ism, which is why I never heard of it, learning my assembler on the streets as I did. It's a pretty recondite point for a general article on assembly language, which must perforce pay attention to the world outside of IBM. --Wtshymanski (talk) 21:28, 24 November 2021 (UTC)

You’re right that Google shows a lot of irrelevant results in a generic search for “open code”, but it’s not just an IBM-ism; it is regularly used when talking about macros and conditional assembly. For example, here’s one result from a book on MASM programming. [1] If the term is used here, however, it should probably be defined. Peter Flass (talk) 02:27, 25 November 2021 (UTC)

It is useful to include features which exist in assemblers for a variety of machines. In this case, the actual features are assembler variables and conditional assembly. That is, the assembler equivalent of C's #define and #ifdef. (More generally, #if.) AGO allows for loops, which might be rarer in assemblers. These are similar to the features of the PL/I preprocessors and, as far as I know, are implemented in a preprocessor stage by assemblers. (That is, a temporary file is written for later processing.) The more general case should be covered here. Gah4 (talk) 23:41, 23 June 2024 (UTC)
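To make the C comparison in the comment above concrete, here is a minimal sketch of the preprocessor side of the analogy. The names TRACE_LEVEL, BUFFER_COUNT, TRACE_MESSAGE, trace_message and buffer_bytes are hypothetical and chosen only for illustration; nothing here is taken from HLASM or from the article.

```c
#include <assert.h>

/* Hypothetical build-time settings, playing roughly the role that
   SETx variables play during assembly. */
#define TRACE_LEVEL 2
#define BUFFER_COUNT 4

/* #if chooses which source text survives preprocessing, loosely as
   AIF chooses which statements are assembled. */
#if TRACE_LEVEL >= 2
#define TRACE_MESSAGE "verbose tracing compiled in"
#elif TRACE_LEVEL == 1
#define TRACE_MESSAGE "basic tracing compiled in"
#else
#define TRACE_MESSAGE "tracing compiled out"
#endif

/* A compile-time check on a build-time value (C11 static_assert). */
static_assert(BUFFER_COUNT > 0, "BUFFER_COUNT must be positive");

/* Returns whichever message the #if chain selected. */
const char *trace_message(void)
{
    return TRACE_MESSAGE;
}

/* The macro is replaced by its value before compilation proper. */
int buffer_bytes(void)
{
    return BUFFER_COUNT * 512;
}
```

One difference worth keeping in mind: the C preprocessor has no looping construct, so a backward AGO has no direct counterpart in this sketch.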

While many simple assemblers have a separate preprocessor stage, that is by no means universal. Specifically, the IBM assemblers Assembler H Versions 1 and 2 and High Level Assembler (HLASM) allow macros to query attributes of symbols even when they are defined later in the source code than the macro definition and invocation. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:18, 24 June 2024 (UTC)

Consistency of English articles

There is a general rule in English (from various grammar books) that a usage like "The English language" requires the definite article.

As I understand from this article, there is a family of "assembly languages" (which would require one of them to be "an assembly language", and so it is written in Simple English Wikipedia) and one "assembler language" ("the assembler language", cf. IBM). Some sources also capitalize this word.

~~Unfortunately~~ I'm not a native speaker, so I'm not sure what is correct (and I suspect that the contributors to this article are not exclusively native speakers). Probably all variants are, but maybe there should be a single variant across this article or Wikipedia? Maybe even add a section on its spelling and article (maybe to Wiktionary)?

Yaroslav Nikitenko (talk) 16:02, 13 February 2022 (UTC)

I am a native English speaker, and an electronics engineer who has written short programs in assembly languages for IBM computers and other processors, such as Intel. I have also designed integrated circuits that physically execute the associated machine instructions. I have never noticed any consistent distinction between "assembly language" and "assembler language". Jc3s5h (talk) 16:34, 13 February 2022 (UTC)

There is no "the assembler language of IBM"; IBM has provided many assemblers for many different machines and with wildly varying syntax. The link that you gave is for a specific assembler, HLASM, and it does not look remotely like assemblers for other product lines, e.g., 7070 Autocoder. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:53, 14 February 2022 (UTC)

I suspect I have been wondering about this, almost as long as I knew about assembly language. And pretty much, I still don't know. Mostly I remember hearing assembly language when spoken, and maybe half and half when written. One of those many cases where the English language doesn't do what you think it should. Gah4 (talk) 02:46, 30 November 2023 (UTC)

In my four decades of programming and IC-design experience, I've always heard it called "assembly language." Digital27 (talk) 03:21, 30 November 2023 (UTC)

I've been programming since 1960, and have heard a variety^[a] of terms, including assembler language and assembly language. The key point is that neither term specifies a particular language, but rather a diverse family of languages. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:56, 30 November 2023 (UTC)

Notes

^ Within a single shop the usage tended to be consistent except when there were multiple machines installed.

Assembly Language Primer For Hackers

I don't know if there is a good place to include this as a reference, but there is a short video series that is very good at explaining how assembly language works called Assembly Primer For Hackers. Hopefully there is a place to use this as a reference. Maybe in external links? -- Ubh [talk... contribs...] 05:51, 18 November 2023 (UTC)

I watched the first few minutes of the introduction and it appears to be tailored to assemblers on the Intel 32-bit x86 rather than assemblers in general. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:59, 29 November 2023 (UTC)

Support for truncated-address architecture

Some architectures, e.g., IBM System/360,^[1] UNIVAC III,^[2] have truncated addressing; an instruction does not have enough room for a full address, only an offset against a specified register. Assemblers^[3]^[4]^[5]^[a] for those architectures typically have special features to assist in dealing with addressing, e.g., DSECT and USING for S/360 through IBM z/Architecture. I believe that the article should discuss the issue and give some illustrative examples. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:13, 20 June 2024 (UTC)
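As a rough illustration of the addressing scheme described above, the following C sketch models the System/360 case: a 12-bit displacement added to the contents of a base register (and optionally an index register), truncated to 24 bits. The function names effective_address and displacement_for are hypothetical, and the second function is only a simplified stand-in for the bookkeeping an assembler does on behalf of USING, not a description of how any particular assembler is implemented.

```c
#include <assert.h>
#include <stdint.h>

#define ADDRESS_MASK 0x00FFFFFFu   /* System/360 addresses are 24 bits wide */

/* Effective address of a storage operand: index + base + displacement.
   Register number 0 in the index or base field means "no register"
   and contributes zero. */
uint32_t effective_address(const uint32_t gr[16],
                           unsigned x2, unsigned b2, unsigned d2)
{
    assert(x2 < 16 && b2 < 16 && d2 < 4096);
    uint32_t index = x2 ? gr[x2] : 0;
    uint32_t base = b2 ? gr[b2] : 0;
    return (index + base + d2) & ADDRESS_MASK;
}

/* Simplified assembler-side view: given the address a base register is
   promised to hold, find the displacement that reaches `target`.
   Returns -1 when the target lies outside the 4096-byte window. */
int displacement_for(uint32_t base_value, uint32_t target)
{
    if (target < base_value || target - base_value > 4095) {
        return -1;
    }
    return (int)(target - base_value);
}
```

With register 12 holding 0x5000, a target at 0x5123 is reachable with displacement 0x123, while a target at 0x6000 is one byte past the window and would need a second base register.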

Truncated addressing seems like it goes to address mode, but yes, the assembler features needed could go here. Many RISC processors seem to have 32-bit instructions and 32-bit addresses, which requires some way to encode an address in two instructions. That, then, usually means some assembler feature to generate such code. I would guess that a feature like DSECT isn't so unusual, but USING might be rare. Gah4 (talk) 00:26, 21 June 2024 (UTC)
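One common shape for the "address in two instructions" feature is a pair of assembler operators that split a 32-bit value. The sketch below uses the SPARC split as its example (a 22-bit upper part for sethi and a 10-bit lower part, as the %hi and %lo operators produce); that choice of architecture is an assumption for illustration, and the function names hi22, lo10 and rebuild are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 22 bits of the address, the part a sethi-style instruction carries. */
uint32_t hi22(uint32_t address)
{
    return address >> 10;
}

/* Lower 10 bits of the address, supplied by a second instruction. */
uint32_t lo10(uint32_t address)
{
    return address & 0x3FFu;
}

/* What the two instructions together leave in the register. */
uint32_t rebuild(uint32_t hi, uint32_t lo)
{
    assert(hi < (1u << 22) && lo < (1u << 10));
    return (hi << 10) | lo;
}
```

Other architectures divide the bits differently, and some need a sign adjustment on the upper part, so the split shown here should not be read as general.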

Many more recent processors use PC relative addressing, which avoids the need for USING for code addresses. Addressing DSECT is a different question, and I don't know enough different assemblers to know. And I think PC relative addressing should have its own page. Gah4 (talk) 04:48, 21 June 2024 (UTC)

Describing PC-relative addressing and describing assembler support for it are two separate issues, and I believe that there is a case for both. Note that processors as far back as Atlas, GE 635 and DEC PDP-11 allowed indexing by the program counter, long before RISC architectures, e.g., IBM 801, Power ISA. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 10:59, 21 June 2024 (UTC)

Note also that not all PC-relative addressing is general; some architectures support only PC-relative branching. SPARC, for example, has PC-relative branching with 22-bit displacements, plus a procedure-call instruction with a 30-bit displacement (if the upper 2 bits of the instruction are 01, it's a CALL instruction). Load and store instructions have either 2 register operands, or a register operand and a 13-bit offset, that are added together to generate the memory address. There's no equivalent of USING in SPARC assembler languages, but relatively little assembler-language programming is done for SPARC, unlike System/3x0. (And, for an OS that runs on S/390 and z/Architecture, related to the OSes that run on SPARC, the assembler doesn't, as far as I know, offer any equivalent to USING, as there's not much assembler-language programming done there, either - about 28 files with 4035 lines in the kernel and 91 files with 11160 lines in GNU libc.) -- Preceding unsigned comment added by Guy Harris (talk • contribs) 20:13, 21 June 2024 (UTC)
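The SPARC displacement fields mentioned above can be sketched in C as follows. This is an illustrative decoder only: the function names are hypothetical, the branch helper looks at the low 22 bits without checking the opcode fields, and it assumes the usual SPARC convention that branch and call displacements count 4-byte words.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of `value` (1 <= bits < 32). */
int32_t sign_extend(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits < 32);
    uint32_t sign = 1u << (bits - 1);
    value &= (sign << 1) - 1;                   /* keep only the field */
    return (int32_t)(value ^ sign) - (int32_t)sign;
}

/* A CALL is recognised by the upper two bits of the word being 01. */
int is_call(uint32_t insn)
{
    return (insn >> 30) == 1u;
}

/* CALL target: PC plus the 30-bit word displacement times 4.
   The shift discards the two opcode bits; 32-bit wraparound handles
   backward calls. */
uint32_t call_target(uint32_t pc, uint32_t insn)
{
    return pc + ((insn & 0x3FFFFFFFu) << 2);
}

/* Branch target: PC plus the signed 22-bit word displacement times 4. */
uint32_t branch_target(uint32_t pc, uint32_t insn)
{
    int32_t disp22 = sign_extend(insn & 0x3FFFFFu, 22);
    return pc + (uint32_t)(disp22 * 4);
}

/* Load/store address in the register-plus-offset form: base register
   contents plus the signed 13-bit immediate. */
uint32_t load_store_address(uint32_t base, uint32_t insn)
{
    return base + (uint32_t)sign_extend(insn & 0x1FFFu, 13);
}
```

The 13-bit field reaches only -4096 to +4095 bytes around the base register, which is why the register-plus-offset form cannot by itself name an arbitrary 32-bit address.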

Modern processors discourage mixing of code and data, especially those with separate code and data caches, where it causes many problems (that is, slow execution). I am not, then, surprised that they don't supply PC relative data references. A common use of USING, along with DSECT, is named references to structure members. I suspect some assemblers have a way to do that. (That is, what would be part of a C struct.) USING used to be used for data references in the code, but, as above, that is now discouraged. Gah4 (talk) 21:46, 21 June 2024 (UTC)

"Modern processors discourage mixing of code and data." Heck, the PDP-11 at least didn't encourage it, especially with separate I and D space. Unix programs tended to be built with separate code and data segments, and the code segment was often mapped shared and read-only; some DEC operating systems may have done the same. PC-relative data references were, I think, mostly auto-increment; that's how immediate operands were implemented.

"...named references to structure members. I suspect some assemblers have a way to do that." Named references to structure members are independent of the size of displacements in instructions. 4BSD, at least, had a C program, genassym, which included a bunch of system headers and generated a file with a bunch of #defines for the offsets of various structure members, as part of the kernel generation and build process. The resulting assym.s file would be #included by assembler-language files that needed to refer to those structures. UN*X assemblers tended not to provide full-blown macro facilities or other assists such as DSECTs, as it was not expected that assembly language would be used except in rare cases where either 1) you needed to use specialized instructions to control the machine or 2) you hand-coded some low-level routines such as memory copying, string manipulation, and language support such as larger-than-word-size integer arithmetic (e.g., 32-bit integers on a 16-bit platform or 64-bit integers on a 32-bit platform). They sometimes supported using the C preprocessor as a primitive macro facility, but that's about it. Guy Harris (talk) 22:23, 21 June 2024 (UTC)
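The genassym approach described above can be sketched in a few lines of C. This is not the 4BSD program: struct proc_entry, its members, the P_* names and both functions are hypothetical, and the sketch only shows the general idea of letting the C compiler compute member offsets and printing them as #define lines for assembler sources to include.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical kernel structure whose layout assembler code must know. */
struct proc_entry {
    uint32_t pid;
    uint32_t flags;
    uint64_t stack_pointer;
};

/* Format one "#define NAME offset" line into `out`.
   Returns the character count, as snprintf does. */
int format_offset(char *out, size_t size, const char *name, size_t offset)
{
    assert(out != NULL && name != NULL);
    return snprintf(out, size, "#define %s %zu\n", name, offset);
}

/* Write an assym-style file: one line per member of interest. */
void emit_offsets(FILE *stream)
{
    char line[64];

    format_offset(line, sizeof line, "P_PID",
                  offsetof(struct proc_entry, pid));
    fputs(line, stream);
    format_offset(line, sizeof line, "P_FLAGS",
                  offsetof(struct proc_entry, flags));
    fputs(line, stream);
    format_offset(line, sizeof line, "P_STACK",
                  offsetof(struct proc_entry, stack_pointer));
    fputs(line, stream);
}
```

Calling emit_offsets(stdout) from a small driver at build time would produce the text that an assembler file then pulls in through the C preprocessor. A DSECT, by contrast, has the assembler itself compute such offsets.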

Actually, HLASM runs on zLinux, although probably not Solaris; it's a gas (gd&r).

Somewhat perversely, z/Architecture has a 16-bit relative and a 32-bit relative long; IMHO a 64 KiB single code section is much too large, to say nothing of 4 GiB code sections. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 00:02, 22 June 2024 (UTC)

I suspect I have been surprised by many features added to z/Architecture by now. But some relative branches can be resolved at link time, and so branch more than a single code section. If the linkage editor still has the ability to relink previously linked code, it will need to be able to unresolve such branches. As far as I know, Java still has a 64K byte limit for a single method. Gah4 (talk) 01:10, 22 June 2024 (UTC)

I doubt that the linkage editor can handle branch relative between CSECTs, but possibly the Binder can. I'll have to check the Program Management manual. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 04:10, 23 June 2024 (UTC)

"Actually, HLASM runs on zLinux, although probably not Solaris" ...because OpenSolaris for System z was discontinued; otherwise, IBM might have ported it.

Was HLASM ported to Linux to support moving existing assembler-language code to Linux? Guy Harris (talk) 06:22, 22 June 2024 (UTC)

That would be my guess. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 04:02, 23 June 2024 (UTC)

Could the huge range of the relative long instructions be for compiler support of monolithic C, COBOL, FORTRAN and PL/I applications? -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 04:02, 23 June 2024 (UTC)

On Linux (and, were the project to have finished, on Solaris), it would more likely have been for compiler support of large shared libraries in C/C++/etc.; for executable images, PIC is only necessary if you're building position-independent executables whose starting address can be randomized. I can't speak for z/OS (or VSE or z/TPF). Guy Harris (talk) 10:47, 23 June 2024 (UTC)

Notes

^ Use the most recent HLASM reference.

References

^ IBM System/360 Principles of Operation (PDF). Systems Reference Library (Fourth ed.). IBM. September 1968. A22-6821-7. Retrieved June 20, 2024.
^ Reference Manual - UNIVAC III General - Data Processing System (PDF). Sperry Rand Corporation. 1962. UT-2488. Retrieved June 20, 2024.
^ UNIVAC III General Reference Manual - S A L T (PDF). Sperry Rand Corporation. 1962. UP·2558. Retrieved June 20, 2024.
^ OS Assembler Language - OS Release 21 (PDF). Systems Reference Library (Twelfth ed.). IBM. April 1976. GC28-6514-11. Retrieved June 20, 2024.
^ High Level Assembler for z/OS & z/VM & z/VSE - 1.6 - Language Reference (PDF). Systems Reference Library. IBM. 2021. SC26-4940-09. Retrieved June 20, 2024.

DSECT

IBM assemblers use DSECT in about the way that C programmers use struct. That is, for computing offsets into data structures. (In the case of struct pointers, that is all the compiler does. For an actual struct, it also allocates memory.) What I am wondering now is which other assemblers have a similar feature. This would be the case where the assembler computes the offsets, and not a series of (the equivalent of) #define. Gah4 (talk) 23:31, 23 June 2024 (UTC)
