Showing posts with label genetic code. Show all posts

Tuesday, May 13, 2014

Anaerobic Catalases and Codon Pseudodegeneracy

Today I want to demolish two myths, one being that strict anaerobes (such as members of the Clostridia, a class of bacteria that includes the botulism organism as well as the C. diff bug that sickens half a million people a year in the U.S.) lack a catalase enzyme, and the other being that codons have very little degeneracy in base one. Don't worry, I'll parse all this for you so that even if you're not a bio-geek, you'll get it.

Bacteriology students are taught from Day One that strict anaerobes are killed by oxygen because they lack a catalase enzyme. Catalase is the nearly universal enzyme that breaks down hydrogen peroxide into oxygen and water. The catechism about anaerobes being catalase-negative is true for some anaerobes, but by no means all. If you go to UniProt.org and do a search on "catalase clostridium" you'll get a long list of hits representing anaerobes that do, in fact, have a perfectly normal catalase enzyme. Some anaerobes have the classic katG (E. coli) version of catalase while others (like C. difficile) have a special manganese-containing catalase. Pathogenic anaerobes probably use the enzyme to combat the respiratory burst that accompanies engulfment by white blood cells. I've written about anaerobic catalases before; for an overview of the enzymology and phylogenetic breakdown by enzyme type, start here.

Methanospirillum (a strict anaerobe found in sewage) produces colonies with characteristic striations spaced two cell-lengths apart.

I thought it would be interesting to compare the katG genes of two anaerobes to see how they differ. I decided to focus on C. botulinum B (the nasty food-poisoning organism) and the archaeon Methanospirillum hungatei strain JF-1, which is named after a professor I had in grad school, Bob Hungate, a true pioneer in the study of anaerobic bacteria.

C. botulinum and M. hungatei are worlds apart, taxonomically. The genome of the former has a G+C (guanine, cytosine) content of just 27%, while the latter has G+C of 45%. Comparing their catalases, I found 717 nucleotide differences between the two genes (which were 2193 bases long in one case and 2157 in the other case). On the surface, the genes seem (from a DNA sequence point of view) quite far apart phylogenetically. But after performing an alignment in Mega6, I was able to study the codons base by base and found that of 717 differences, 500 changes were synonymous (resulting in no amino acid substitution) while only 217 changes were non-synonymous. Not unexpectedly, most of the synonymous changes were due to differences in the third codon base (where the genetic code is highly degenerate; see below).
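
The synonymous versus non-synonymous split can be illustrated with a small sketch. It assumes the two coding sequences are already aligned codon-for-codon with no gaps (the real comparison was done in Mega6, whose internals aren't shown here), and it classifies whole codons rather than individual nucleotide differences, which is a simplification. The function name and the toy sequences are hypothetical.

```python
# Standard genetic code (transl_table=1), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def classify_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Assumption: seq_a and seq_b are gap-free, in frame, and equal in length.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned, gap-free, and a multiple of 3 long")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue  # identical codons are not differences at all
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy sequences, not the real katG genes:
# AAA/AAG (K:K), GGT/GGC (G:G), CTA/ATA (L:I)
print(classify_codon_differences("AAAGGTCTA", "AAGGGCATA"))
```

Real alignments contain gaps and codons that differ at more than one position, so a per-nucleotide count like the 717 reported above needs more bookkeeping than this sketch does.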

Codon Usage in Clostridium botulinum B
Codon Usage: The Standard Code (transl_table=1)
CCA(P) 1.45%
CCG(P) 0.10%
CCT(P) 1.00%
CCC(P) 0.06%

CGA(R) 0.08%
CGG(R) 0.01%
CGT(R) 0.18%
CGC(R) 0.02%

CAA(Q) 2.01%
CAG(Q) 0.25%
CAT(H) 1.10%
CAC(H) 0.16%

CTA(L) 0.77%
CTG(L) 0.10%
CTT(L) 1.67%
CTC(L) 0.06%

GCA(A) 2.66%
GCG(A) 0.20%
GCT(A) 2.18%
GCC(A) 0.19%

GGA(G) 3.30%
GGG(G) 0.47%
GGT(G) 2.08%
GGC(G) 0.40%

GAA(E) 6.40%
GAG(E) 1.11%
GAT(D) 5.12%
GAC(D) 0.62%

GTA(V) 2.79%
GTG(V) 0.45%
GTT(V) 2.81%
GTC(V) 0.14%

ACA(T) 2.36%
ACG(T) 0.16%
ACT(T) 2.22%
ACC(T) 0.19%

AGA(R) 2.43%
AGG(R) 0.31%
AGT(S) 1.89%
AGC(S) 0.48%

AAA(K) 7.10%
AAG(K) 2.17%
AAT(N) 6.06%
AAC(N) 0.89%

ATA(I) 5.68%
ATG(M) 2.50%
ATT(I) 4.01%
ATC(I) 0.42%

TCA(S) 2.10%
TCG(S) 0.12%
TCT(S) 1.68%
TCC(S) 0.12%

TGA(*) 0.02%
TGG(W) 0.69%
TGT(C) 1.02%
TGC(C) 0.25%

TAA(*) 0.24%
TAG(*) 0.07%
TAT(Y) 3.61%
TAC(Y) 0.55%

TTA(L) 5.68%
TTG(L) 0.74%
TTT(F) 3.74%
TTC(F) 0.57%
Pink-highlighted codons are examples of where the genetic code is degenerate for base one. A change of CGA to AGA (for example) results in no amino acid change: the coded-for amino acid is arginine in either case.


"Degeneracy" means a change in a particular base results in no change of amino acid. An example is the codon AAA (see above), which is the code for lysine (K). Changing the last base to G (resulting in a codon of AAG) still produces lysine in the resulting protein. Codons that begin with 'AA' are two-fold degenerate, because two versions (AAA and AAG) produce the same amino acid. A codon that begins with 'CC' is said to be four-fold degenerate, because any of four different endings (CCA, CCT, CCG, CCC) produce the same amino acid (P, proline).

What's less obvious is that a change in the first (rather than the third) base of a codon can produce synonyms as well. For example, changing CGA to AGA results in the same amino acid: arginine (R). Likewise, CTA changing to TTA still results in leucine (L).
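
Both kinds of degeneracy can be checked directly against the standard code. The sketch below is a minimal illustration, with hypothetical function names: one helper groups the four possible third bases by the amino acid they yield for a given two-base prefix, and the other lists every pair of sense codons that differ only in base one yet code for the same amino acid.

```python
# Standard genetic code (transl_table=1), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def synonymous_endings(prefix):
    """Map each amino acid to the third bases that encode it after `prefix`."""
    groups = {}
    for third in BASES:
        groups.setdefault(CODON_TABLE[prefix + third], []).append(third)
    return groups


def first_base_synonyms():
    """Return sorted codon pairs that differ only in base one but are synonymous."""
    pairs = set()
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # skip stop codons
        for first in BASES:
            other = first + codon[1:]
            if other != codon and CODON_TABLE[other] == amino_acid:
                pairs.add(tuple(sorted((codon, other))))
    return sorted(pairs)


print(synonymous_endings("AA"))   # lysine and asparagine: two endings each
print(synonymous_endings("CC"))   # proline: all four endings
print(first_base_synonyms())      # the arginine and leucine pairs
```

In the standard code, the only first-base synonyms are the two arginine pairs (CGA/AGA, CGG/AGG) and the two leucine pairs (CTA/TTA, CTG/TTG).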

But there's more degeneracy to the first codon base than just the classic cases outlined in pink in the above table. A change of CTA to ATA produces a switch from leucine to isoleucine, which is tantamount to no change at all, because leucine and isoleucine are isomers (both have the same chemical formula; they differ slightly in arrangement).


Likewise, a change from CTx to GTx produces a change from leucine to valine, which are workalikes. One could even argue that a change from CTT or CTC to TTT or TTC (leucine to phenylalanine) is functionally a degenerate change, because leucine and phenylalanine are nonpolar, hydrophobic amino acids with similar chemical properties.

So in fact, many changes to the first base of a codon can be considered pseudodegenerate, if you will, by virtue of producing non-synonymous changes that are tantamount to very little actual chemical change. We would expect to see a significant number of such changes in mutations affecting base one of codons. And we do.

In the two catalase genes, I looked at changes involving just the first base of a codon and found the following changes:

Synonymous:
AGA<-->CGA (R:R)
AGA<-->CGA (R:R)
AGA<-->CGA (R:R)
TTA<-->CTA (L:L)
TTG<-->CTG (L:L)
TTG<-->CTG (L:L)
TTG<-->CTG (L:L)
TTG<-->CTG (L:L)

Non-Synonymous:
AAA<-->CAA (K:Q)
AAA<-->GAA (K:E)
AAA<-->GAA (K:E)
AAG<-->CAG (K:Q)
AAG<-->CAG (K:Q)
AAT<-->CAT (N:H)
AAT<-->GAT (N:D)
AAT<-->GAT (N:D)
ATT<-->CTT (I:L) *
ATT<-->GTT (I:V) *
CAT<-->GAT (H:D)
CTG<-->ATG (L:M)
CTT<-->ATT (L:I) *
GCA<-->TCA (A:S)
TAT<-->CAT (Y:H)
TCA<-->GCA (S:A)
TCT<-->CCT (S:P)
TTT<-->CTT (F:L) *


Notice that 4 out of 18 non-synonymous changes (see asterisks) fall in the category of pseudodegeneracies. That's slightly higher than the 17% we would expect by chance. So instead of 8 out of 26 first-base changes being degenerate (in the catalase genes), more like 12 out of 26 are either synonymous or near-synonymous. Bottom line, roughly half of first-base mutations are effectively synonymous, at least in this example.
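
The tally above can be reproduced with a short sketch. The codon pairs are copied from the two lists; the set of "workalike" amino acids (L, I, V, F) is an assumption chosen to match the asterisked entries, not a definition taken from any library.

```python
# Standard genetic code (transl_table=1), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip([a + b + c for a in BASES for b in BASES for c in BASES], AMINO))

# The 26 first-base changes listed in the post.
FIRST_BASE_CHANGES = [
    ("AGA", "CGA"), ("AGA", "CGA"), ("AGA", "CGA"), ("TTA", "CTA"),
    ("TTG", "CTG"), ("TTG", "CTG"), ("TTG", "CTG"), ("TTG", "CTG"),
    ("AAA", "CAA"), ("AAA", "GAA"), ("AAA", "GAA"), ("AAG", "CAG"),
    ("AAG", "CAG"), ("AAT", "CAT"), ("AAT", "GAT"), ("AAT", "GAT"),
    ("ATT", "CTT"), ("ATT", "GTT"), ("CAT", "GAT"), ("CTG", "ATG"),
    ("CTT", "ATT"), ("GCA", "TCA"), ("TAT", "CAT"), ("TCA", "GCA"),
    ("TCT", "CCT"), ("TTT", "CTT"),
]

# Assumption: these four hydrophobic residues count as interchangeable.
WORKALIKES = set("LIVF")

synonymous = 0
near_synonymous = 0
for codon_a, codon_b in FIRST_BASE_CHANGES:
    aa_a, aa_b = CODON_TABLE[codon_a], CODON_TABLE[codon_b]
    if aa_a == aa_b:
        synonymous += 1
    elif aa_a in WORKALIKES and aa_b in WORKALIKES:
        near_synonymous += 1

total = len(FIRST_BASE_CHANGES)
nonsynonymous = total - synonymous
effective_fraction = (synonymous + near_synonymous) / total
print(synonymous, nonsynonymous, near_synonymous, round(effective_fraction, 2))
```

Widening or narrowing the workalike set changes the near-synonymous count, so the "roughly half" figure depends on how generously chemical similarity is defined.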

Saturday, April 19, 2014

Codons and Reverse Complement Codons

A very unusual and surprising property of protein-coding genes is that if a codon A appears with a certain frequency in genes, the reverse-complement codon of A will also have a similar frequency of occurrence. For example: If CTT (leucine) appears at a frequency of 1%, the reverse complement codon AAG (lysine) will also appear at roughly 1%. If CGT (arginine) appears at 0.2%, ACG (threonine) will appear at around 0.2%. (These are whole-genome frequencies.)
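
For anyone who wants to check the pairings, here is a minimal sketch of the reverse-complement operation (the function name is hypothetical): complement each base, then reverse the string.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then read the result backwards."""
    return seq.translate(COMPLEMENT)[::-1]


for codon in ("CTT", "CGT"):
    print(codon, "->", reverse_complement(codon))
```

Applying the function twice returns the original codon, which is why the 64 codons fall into reciprocal pairs.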

This correlation is strongest (r=0.75) for organisms with a high genomic G+C content, such as Streptomyces griseus, and lowest (r=0.28) in low-GC organisms like Clostridium botulinum.

This is a very peculiar property, when you think about it. We don't usually imagine an organism being constrained in its choice of codons for a particular protein. If a particular protein calls for a huge number of leucines (CTTCTTCTT) we don't imagine that there's a requirement for an equivalent quantity of AAG to be used somewhere else. And yet, the correlation between frequency-of-occurrence of a codon and its antisymmetric twin is, as I say, surprisingly high in many organisms.

This sort of thing is very hard to explain without invoking a theory of proteogenesis that involves antisense proteins. Imagine a poly-lysine gene of AAA repeated 100 times. The gene gets duplicated on the opposite strand. Now the original strand has 100 AAAs and a run of 100 TTTs. If a reading frame opens up on the TTT stretch (and the protein is beneficial to the organism; it survives), there is now codon/anticodon parity of the kind I'm describing, between codons in the poly-lysine gene and the poly-phenylalanine (TTT) gene.

Why does this relationship hold for high-GC organisms but not as much for low-GC organisms? Probably because antisense genes in high-AT organisms contain a lot of stop codons (TAA, TGA, TAG, which by the way occur at about the same frequencies as TTA, TCA, and CTA, respectively). The presence of few stop codons in high-GC antisense genes gives those genes a chance to be expressed and evolve further. Of course, if you buy this theory, it tends to argue for a "GC World" scenario in which the early proteome evolved from GC-rich double-stranded genomes.
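
The link between TTA, TCA, CTA and the stop codons is easy to verify: each is the reverse complement of a stop. The sketch below (hypothetical function names, toy sequences) counts how many in-frame stops appear when a coding sequence is read on the opposite strand in the matching frame.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TGA", "TAG"}


def reverse_complement(seq):
    """Complement every base, then read the result backwards."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_stop_count(cds):
    """Count stop codons in the reverse complement of an in-frame CDS.

    Assumption: len(cds) is a multiple of 3, so the antisense read stays in
    register with the sense codons.
    """
    antisense = reverse_complement(cds)
    return sum(antisense[i:i + 3] in STOP_CODONS for i in range(0, len(antisense) - 2, 3))


print(sorted(reverse_complement(stop) for stop in STOP_CODONS))
print(antisense_stop_count("TTAGCCCTA"))  # AT-rich toy sequence
print(antisense_stop_count("CTGGCCGAC"))  # GC-rich toy sequence
```

Every sense-strand TTA, TCA, or CTA codon becomes an antisense stop, so a genome that leans heavily on those codons leaves little room for long antisense reading frames.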

To illustrate the unusual correlation I'm talking about, I took the codon frequencies of Pseudomonas fluorescens PF01 (genome-wide) and made a graph that plots the frequency of occurrence of each codon on the x-axis, versus the frequency of occurrence of the corresponding reverse-complement codon on the y-axis. (So if CTA occurs at 0.3% and TAG occurs at 0.2%, I plot a point at [0.3, 0.2].) The SVG graph (below) is interactive: You should be able to hover over a point and see a tooltip that shows the identity of the corresponding codon, and its reverse twin, and their respective frequencies.

NOTE: If your browser does not support SVG, a PNG copy of the graph is here.

The symmetry pattern is expected: For every codon/anticodon there's a corresponding anticodon/codon pair with frequencies swapped. What's more important than the symmetry pattern is the fact that frequency values in Y tend to increase with X and vice versa, with a correlation coefficient in this case of r=0.63 (F-statistic 41, p < .001). This means that codons tend to occur at about the same frequencies as their reverse complement codons. There are outliers, to be sure, but the overall trend is statistically solid.
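
A minimal sketch of how such a plot and correlation could be computed follows. The function names are hypothetical, and the tiny frequency table uses only the illustrative percentages quoted in the text (CTT/AAG at 1%, CGT/ACG at 0.2%, CTA at 0.3%, TAG at 0.2%), not real Pseudomonas data.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def revcomp_points(freqs):
    """One (x, y) point per codon: its own frequency vs. its reverse complement's."""
    return [(freq, freqs[reverse_complement(codon)])
            for codon, freq in freqs.items()
            if reverse_complement(codon) in freqs]


def pearson_r(points):
    """Plain Pearson correlation coefficient over (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


# Illustrative values only (percent of all codons).
toy_freqs = {"CTT": 1.0, "AAG": 1.0, "CGT": 0.2, "ACG": 0.2, "CTA": 0.3, "TAG": 0.2}
points = revcomp_points(toy_freqs)
r = pearson_r(points)
print(points)
print(round(r, 3))
```

Because each pair contributes both (x, y) and (y, x), the cloud is mirror-symmetric about the diagonal by construction; the interesting quantity is how tightly it hugs that diagonal.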

Leave a comment if you have any thoughts on what's going on here.

Tuesday, April 08, 2014

Why do so many codons begin with a purine?

With the advent of sites like genomevolution.org (where you can download genomes, create synteny graphs, run BLAST searches, and do all sorts of desktop bioinformatics), it's ridiculously easy for someone interested in comparative genomics to . . . well, compare genomes, for one thing. And if you look at enough gene sequences, a couple of things pop out.

One thing that pops out is that most codons, in most genes, begin with a purine (namely A or G: adenine or guanine). Also, codons typically show the greatest GC swing in base number three. These trends can be seen in the chart below, where I show average base composition (by codon position) for three well-studied organisms. For clarity, base-one purines are shown in bold and base-three G and C are shown highlighted in yellow.

Organism       Codon base   A       G       C       T
S. griseus     1            0.166   0.434   0.287   0.112
S. griseus     2            0.224   0.219   0.295   0.261
S. griseus     3            0.037   0.394   0.530   0.038
E. coli        1            0.256   0.343   0.238   0.161
E. coli        2            0.291   0.181   0.222   0.304
E. coli        3            0.186   0.285   0.261   0.265
C. botulinum   1            0.395   0.299   0.094   0.210
C. botulinum   2            0.374   0.136   0.161   0.328
C. botulinum   3            0.442   0.108   0.064   0.383

Streptomyces griseus is a common soil bacterium that happens to have very high genomic G+C content (72.1% overall, although you can see that in base three of codons the G+C content is more like 92%).

E. coli represents a middle-of-the-road organism in terms of G+C content (50.8% overall), while our ugly friend Clostridium botulinum (the soil organism that can ruin your whole day if it finds its way into a can of tomatoes) has very low genomic G+C content (around 28%).

Even though these organisms differ greatly in G+C content, they all illustrate the (universal) trend toward usage of purines (A or G) in the first position of a codon. Something like 59% to 69% of the time (depending on the organism), codons look like Pnn, where P is a purine base and 'n' represents any base. This is true for viruses as well as cellular genomes.
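
The percentages quoted here follow directly from the base-composition table. As a quick check, the sketch below (hypothetical names, values copied from the table) adds A and G for the purine fraction and G and C for the G+C fraction at any codon position.

```python
# Base composition by codon position, copied from the table above.
COMPOSITION = {
    "S. griseus": {
        1: {"A": 0.166, "G": 0.434, "C": 0.287, "T": 0.112},
        2: {"A": 0.224, "G": 0.219, "C": 0.295, "T": 0.261},
        3: {"A": 0.037, "G": 0.394, "C": 0.530, "T": 0.038},
    },
    "E. coli": {
        1: {"A": 0.256, "G": 0.343, "C": 0.238, "T": 0.161},
        2: {"A": 0.291, "G": 0.181, "C": 0.222, "T": 0.304},
        3: {"A": 0.186, "G": 0.285, "C": 0.261, "T": 0.265},
    },
    "C. botulinum": {
        1: {"A": 0.395, "G": 0.299, "C": 0.094, "T": 0.210},
        2: {"A": 0.374, "G": 0.136, "C": 0.161, "T": 0.328},
        3: {"A": 0.442, "G": 0.108, "C": 0.064, "T": 0.383},
    },
}


def purine_fraction(organism, position):
    """Fraction of codons with A or G at the given codon position."""
    row = COMPOSITION[organism][position]
    return row["A"] + row["G"]


def gc_fraction(organism, position):
    """Fraction of codons with G or C at the given codon position."""
    row = COMPOSITION[organism][position]
    return row["G"] + row["C"]


for name in COMPOSITION:
    print(name, round(purine_fraction(name, 1), 3), round(gc_fraction(name, 3), 3))
```

Base-one purine content lands between roughly 0.60 and 0.69 in all three organisms, even though base-three G+C swings from about 0.17 to 0.92.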

This pattern is so universal, one wonders why it exists. I think a credible, parsimonious explanation is that when protein-coding genes look like PnnPnnPnn... (etc.) it makes for a crisp reading frame. It's easy to see that a +1 frameshift results in a repeating nPn pattern and a +2 frameshift results in repeats of nnP. These are easily distinguished from Pnn.
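
The frame patterns can be seen with a toy string. This sketch assumes that a "+1 frameshift" means one extra base inserted upstream of the codons (and "+2" means two extra bases); under a different convention the two labels would swap. The helper name is hypothetical.

```python
def codons(seq):
    """Split a sequence into complete triplets, dropping any trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


gene = "Pnn" * 4            # idealized in-frame gene: purine, any, any

in_frame = codons(gene)
plus_one = codons("n" + gene)    # assumption: one base inserted upstream
plus_two = codons("nn" + gene)   # assumption: two bases inserted upstream

print(in_frame)
print(plus_one)
print(plus_two)
```

Whatever the labeling convention, the point stands: the three frames put the purine in three different codon positions, so they are easy to tell apart.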

There are benefits for a PnnPnnPnn... reading frame. In a previous post, I showed that when most of a gene's codons have a pyrimidine in base two, the resulting protein gets shipped to the cell membrane. (This is a simple consequence of the fact that codons with a pyrimidine in position two tend to code for hydrophobic, lipid-soluble amino acids.) Because a +1 reading-frame shift produces repeats of nPn, the Pnn "default" pattern means that +1 frameshifted gene products, if they occur, won't get shipped to the cell membrane. This is an extremely important outcome, because membrane proteins are, in general, highly transcribed and under strong selective pressure. In addition to specifying antigenic properties and determining phage resistance, membrane proteins make up proton pumps, secretion systems, symporters, kinases, flagellar components, and many other kinds of proteins. They determine the cell's "interface" to the world. They also maintain cell osmolarity and membrane redox potential. Messing with membrane proteins is bound to be risky. Much better to keep frameshifted nonsense proteins away from the membrane.

Fairly strong support for this notion (that Pnn codons provide a crisp reading frame) comes from studies of naturally occurring frameshift signals in DNA. We now know that in many organisms, certain "slippery" DNA signals (usually heptamers, like CCCTGAC) instruct the ribosome to change reading frames. (See, for example, "A Gripping Tale of Ribosomal Frameshifting: Extragenic Suppressors of Frameshift Mutations Spotlight P-Site Realignment," Atkins and Björk, Microbiol. Mol. Biol. Rev. 2009. Also, for fun, be sure to check out some of the papers on quadruplet decoding, which leaves room for alien life forms with 200 amino acids instead of 20.) The "slippery heptamer" frameshift signals that have thus far been identified tend to contain runs of pyrimidines.

Also tending to support the "Pnn = crisp reading frame" notion is the fact that stop codons (TGA, TAA, TAG) look like pPP (where 'p' is a pyrimidine and 'P' is a purine). Again, a crisp distinction.

As for why purines were chosen (and not pyrimidines) to begin the Xxx pattern, again I think a fairly parsimonious answer is available: ATP and GTP are the most abundant nucleoside-triphosphates in vivo. These are the energy sources for nucleic-acid and protein synthesis, respectively.

A prediction: If we run into an alien life form (in the oceans of Europa, say) and it turns out to be the case that UTP (instead of ATP) is the "universal energy molecule" in that life form's cells, then that life form's codons will probably begin with U and form Unn triplets (or Unnn quadruplets, perhaps) a higher-than-average percentage of the time.

Sunday, April 06, 2014

A binary signal in the second codon base

In looking at base composition statistics of codons, an amazing fact jumps out.

If you look carefully at the following graph, you can see that the cloud of gold-colored data points (representing the compositional stats for the second base in codons of Clostridium botulinum) has a second, "breakaway" cloud underneath it. (See arrow.)

Codon base composition statistics for Clostridium botulinum. Notice the breakaway cloud of gold points under the main cloud (arrow). These points represent genes in which most codons have a pyrimidine in the middle base of each codon.

To review: I made this plot by going through each DNA sequence for each coding ("CDS") gene of C. botulinum, and for each gene, I went through all codons and calculated the average purine content (as well as the average G+C content) of the bases at positions one, two, and three in the codons. Thus, every dot represents the stats for one gene's worth of data.
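
The per-gene calculation described above might look something like the following sketch. It is not the original plotting script; the function name is hypothetical, and it assumes each CDS is an uppercase, in-frame DNA string containing at least one complete codon.

```python
def codon_position_stats(cds):
    """Return {position: (purine_fraction, gc_fraction)} for codon positions 1-3."""
    usable = len(cds) - len(cds) % 3   # ignore any trailing partial codon
    if usable == 0:
        raise ValueError("CDS must contain at least one complete codon")
    stats = {}
    for offset in range(3):
        bases = cds[offset:usable:3]   # every third base, starting at this offset
        purine = sum(base in "AG" for base in bases) / len(bases)
        gc = sum(base in "GC" for base in bases) / len(bases)
        stats[offset + 1] = (purine, gc)
    return stats


# Toy two-codon "gene": ATG GCA
print(codon_position_stats("ATGGCA"))
```

Each gene then contributes three points to the plot, one per codon position, with G+C on one axis and purine content on the other.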

After looking at graphs of this sort, three key facts about codon bases leap out:
  • Most codons, for most genes, have a purine as the first base (notice how the red cloud of points is higher than the others, centering on y=0.7).
  • The third base (often called the "wobble" base; shown blue here) has the most extreme G+C value. (This is well known.)
  • The middle base falls in one of two positions (high or low) on the purine (y) axis. There's a primary cloud of data points and a secondary cloud in a distinct region below the main cloud. The secondary cloud of gold points is centered at about 0.3 on the y-axis, meaning these are genes in which the second codon base tends to be a pyrimidine.

The question is: What does it mean when you look at a gene with 200 or 300 or 400 codons, and the majority of codons have a pyrimidine in the second base?

If you examine the standard codon translation table (below), you can see that codons with a pyrimidine in the second position (represented by the first two columns of the table) code primarily for nonpolar amino acids. When a pyrimidine is in the second base, the possible amino acids are phenylalanine, serine, leucine, proline, isoleucine, methionine, threonine, valine, and alanine. Of these, all but serine and threonine are nonpolar. Therefore, a pyrimidine in position two of a codon means there's at least a 75% chance that the amino acid will have a nonpolar, hydrophobic side group.
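
The 75% figure can be confirmed by enumerating the 32 codons with T or C in the middle position. The nonpolar set used below (F, L, P, I, M, V, A) follows the classification in the paragraph above and is an assumption of this sketch; other textbooks draw the line slightly differently.

```python
# Standard genetic code (transl_table=1), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip([a + b + c for a in BASES for b in BASES for c in BASES], AMINO))

# Assumption: nonpolar residues as classified in the text.
NONPOLAR = set("FLPIMVA")

middle_pyrimidine = [codon for codon in CODON_TABLE if codon[1] in "TC"]
residues = {CODON_TABLE[codon] for codon in middle_pyrimidine}
nonpolar_codons = [codon for codon in middle_pyrimidine if CODON_TABLE[codon] in NONPOLAR]
nonpolar_fraction = len(nonpolar_codons) / len(middle_pyrimidine)

print(sorted(residues))
print(len(nonpolar_codons), "of", len(middle_pyrimidine), "=", nonpolar_fraction)
```

The count treats all 32 codons as equally likely. In a real genome the figure shifts with codon usage, so 75% is a property of the code table rather than of any particular organism.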


Virtually all proteins contain some nonpolar amino acids, but when a protein contains mostly nonpolar amino acids, that protein is destined to wind up in the cell membrane (which is largely made up of lipids). Thus, we can expect to find that genes in the "breakaway" cloud of gold points in the graph further above represent membrane-associated proteins.

To check this hypothesis, I wrote a script that harvested the gene names, and protein-product names, of all the "breakaway cloud" data points. After purging genes annotated (unhelpfully) as "hypothetical protein," I was left with 37 "known" genes. They're shown in the following table.

Gene Product
CLC_0571 arsenical pump family protein
CLC_1058 L-lactate permease
CLC_1550 carbohydrate ABC transporter permease
CLC_3115 xanthine/uracil permease family protein
CLC_0813 arsenical-resistance protein
CLC_1018 putative anion ABC transporter, permease protein
CLC_1687 xanthine/uracil permease family protein
CLC_3633 sporulation integral membrane protein YtvI
CLC_2382 phosphate ABC transporter, permease protein PstA
CLC_0189 ZIP transporter family protein
CLC_3351 sodium:dicarboxylate symporter family protein
CLC_0971 cobalt transport protein CbiM
CLC_1534 methionine ABC transporter permease
CLC_2798 xanthine/uracil permease family protein
CLC_1397 manganese/zinc/iron chelate ABC transporter permease
CLC_1836 stage III sporulation protein AD
CLC_0528 high-affinity branched-chain amino acid ABC transporter, permease protein
CLC_0430 electron transport complex, RnfABCDGE type, A subunit
CLC_2523 flagellar biosynthesis protein FliP
CLC_0401 amino acid permease family protein
CLC_0383 lrgB-like family protein
CLC_0457 chromate transporter protein
CLC_0291 sodium:dicarboxylate symporter family protein
CLC_0427 electron transport complex, RnfABCDGE type, D subunit
CLC_1281 putative transcriptional regulator
CLC_2008 ABC transporter, permease protein
CLC_0868 branched-chain amino acid transport system II carrier protein
CLC_1237 monovalent cation:proton antiporter-2 (CPA2) family protein
CLC_1137 methionine ABC transporter permease
CLC_0764 putative drug resistance ABC-2 type transporter, permease protein
CLC_1953 xanthine/uracil permease family protein
CLC_2444 auxin efflux carrier family protein
CLC_0897 putative ABC transporter, permease protein
CLC_1555 C4-dicarboxylate transporter/malic acid transport protein
CLC_0374 xanthine/uracil permease family protein
CLC_0470 undecaprenyl pyrophosphate phosphatase
CLC_2648 monovalent cation:proton antiporter-2 (CPA2) family protein

Notice that with the exception of CLC_1281, a "putative transcriptional regulator," every gene product represents a membrane-associated protein: transporters, carrier proteins, permeases, etc.
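
The harvesting step could be sketched roughly as follows. This is not the original script: the record layout (dicts with "locus", "product", and "cds" keys), the function names, and the 0.4 cutoff are all hypothetical, with the cutoff picked only because the breakaway cloud is centered near 0.3 on the purine axis.

```python
def middle_base_purine_fraction(cds):
    """Fraction of codons whose second base is A or G."""
    usable = len(cds) - len(cds) % 3     # ignore any trailing partial codon
    middles = cds[1:usable:3]
    return sum(base in "AG" for base in middles) / len(middles)


def breakaway_genes(records, threshold=0.4):
    """Keep annotated genes whose middle codon base is mostly a pyrimidine."""
    hits = []
    for record in records:
        if record["product"].strip().lower() == "hypothetical protein":
            continue                      # purge unhelpful annotations
        if middle_base_purine_fraction(record["cds"]) < threshold:
            hits.append((record["locus"], record["product"]))
    return hits


# Toy records, not real C. botulinum genes.
toy_records = [
    {"locus": "toy_0001", "product": "toy permease", "cds": "ATTCTGGTCTTT"},
    {"locus": "toy_0002", "product": "hypothetical protein", "cds": "ATTCTGGTCTTT"},
    {"locus": "toy_0003", "product": "toy kinase", "cds": "AAAGAGGATCGT"},
]
print(breakaway_genes(toy_records))
```

In practice the records would come from parsing an annotated genome file, and the cutoff would be chosen by looking at where the two clouds separate in the plot.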

I ran the same experiment on genes from Streptomyces griseus (strain XylbKG-1) and came up with 222 genes having high pyrimidine content in base two. Virtually all of the annotated products are membrane-associated proteins (SACT1_5532, annotated as an aldolase, is a notable exception). (The full list is in a table below.)

The bottom line: Base two of codons acts as a binary switch. If the base is a pyrimidine, the associated amino acid will most likely (75% chance) be nonpolar. If the base is a purine, the codon will either be a stop codon (3 out of 32 codons) or the amino acid will be polar (26 out of 29 codons).

Here's the list of 222 genes from S. griseus in which the middle codon base is predominantly a pyrimidine:

SACT1_0608 ABC-type transporter, integral membrane subunit
SACT1_3730 major facilitator superfamily MFS_1
SACT1_4066 cation efflux protein
SACT1_5911 ABC-2 type transporter
SACT1_6577 SNARE associated protein
SACT1_6966 ABC-type transporter, integral membrane subunit
SACT1_7160 major facilitator superfamily MFS_1
SACT1_3151 NADH-ubiquinone/plastoquinone oxidoreductase chain 6
SACT1_3682 drug resistance transporter, EmrB/QacA subfamily
SACT1_5431 Citrate transporter
SACT1_3199 proton-translocating NADH-quinone oxidoreductase, chain M
SACT1_7301 putative integral membrane protein
SACT1_3198 NAD(P)H-quinone oxidoreductase subunit 2
SACT1_2008 arsenical-resistance protein
SACT1_3149 proton-translocating NADH-quinone oxidoreductase, chain L
SACT1_3967 MATE efflux family protein
SACT1_3148 proton-translocating NADH-quinone oxidoreductase, chain M
SACT1_5571 major facilitator superfamily MFS_1
SACT1_0651 major facilitator superfamily MFS_1
SACT1_2669 ABC-type transporter, integral membrane subunit
SACT1_1805 NADH dehydrogenase (quinone)
SACT1_3147 NAD(P)H-quinone oxidoreductase subunit 2
SACT1_6961 major facilitator superfamily MFS_1
SACT1_0992 ABC-2 type transporter
SACT1_2619 major facilitator superfamily MFS_1
SACT1_0507 major facilitator superfamily MFS_1
SACT1_0649 ABC-type transporter, integral membrane subunit
SACT1_0800 glycosyl transferase family 4
SACT1_1659 ABC-type transporter, integral membrane subunit
SACT1_1803 multiple resistance and pH regulation protein F
SACT1_4190 putative ABC transporter permease protein
SACT1_5522 drug resistance transporter, EmrB/QacA subfamily
SACT1_5568 drug resistance transporter, EmrB/QacA subfamily
SACT1_7248 Lysine exporter protein (LYSE/YGGA)
SACT1_0266 ABC-2 type transporter
SACT1_0847 Na+/solute symporter
SACT1_4378 ABC-type transporter, integral membrane subunit
SACT1_6766 ABC-type transporter, integral membrane subunit
SACT1_2522 putative integral membrane protein
SACT1_4762 amino acid permease-associated region
SACT1_4901 major facilitator superfamily MFS_1
SACT1_2616 multiple antibiotic resistance (MarC)-related protein
SACT1_3961 major facilitator superfamily MFS_1
SACT1_6236 MIP family channel protein
SACT1_1319 protein of unknown function UPF0016
SACT1_2332 copper resistance D domain protein
SACT1_5327 ABC-type transporter, integral membrane subunit
SACT1_5759 ABC-2 type transporter
SACT1_1133 ABC-type transporter, integral membrane subunit
SACT1_1562 ABC-type transporter, integral membrane subunit
SACT1_5518 major facilitator superfamily MFS_1
SACT1_3430 ABC-type transporter, integral membrane subunit
SACT1_4517 major facilitator superfamily MFS_1
SACT1_4565 drug resistance transporter, EmrB/QacA subfamily
SACT1_4994 ABC-type transporter, integral membrane subunit
SACT1_7197 major facilitator superfamily MFS_1
SACT1_5949 major facilitator superfamily MFS_1
SACT1_6233 protein of unknown function DUF6 transmembrane
SACT1_0936 ABC-type transporter, integral membrane subunit
SACT1_4993 2-aminoethylphosphonate ABC transporter, permease protein
SACT1_6954 ABC-type transporter, integral membrane subunit
SACT1_1846 Lysine exporter protein (LYSE/YGGA)
SACT1_3429 ABC-type transporter, integral membrane subunit
SACT1_3957 membrane protein of unknown function
SACT1_4612 ABC-2 type transporter
SACT1_1998 polar amino acid ABC transporter, inner membrane subunit
SACT1_2093 ABC-type transporter, integral membrane subunit
SACT1_4420 ABC-2 type transporter
SACT1_6613 putative ABC transporter permease protein
SACT1_2564 small multidrug resistance protein
SACT1_3669 major facilitator superfamily MFS_1
SACT1_4186 major facilitator superfamily MFS_1
SACT1_4850 ABC-type transporter, integral membrane subunit
SACT1_0206 major facilitator superfamily MFS_1
SACT1_5418 Lysine exporter protein (LYSE/YGGA)
SACT1_0548 ABC-type transporter, integral membrane subunit
SACT1_3332 protein of unknown function DUF6 transmembrane
SACT1_3764 sodium/hydrogen exchanger
SACT1_6278 protein of unknown function DUF6 transmembrane
SACT1_4143 ABC-2 type transporter
SACT1_3232 ABC-type transporter, integral membrane subunit
SACT1_0256 ABC-type transporter, integral membrane subunit
SACT1_2898 major facilitator superfamily MFS_1
SACT1_6510 protein of unknown function DUF6 transmembrane
SACT1_0980 CrcB-like protein
SACT1_1650 ABC-type transporter, integral membrane subunit
SACT1_4658 acyltransferase 3
SACT1_0306 protein of unknown function DUF6 transmembrane
SACT1_2372 major facilitator superfamily MFS_1
SACT1_7238 ABC-type transporter, integral membrane subunit
SACT1_0202 major facilitator superfamily MFS_1
SACT1_0591 ABC-type transporter, integral membrane subunit
SACT1_5369 protein of unknown function DUF81
SACT1_6227 C4-dicarboxylate transporter/malic acid transport protein
SACT1_6755 ABC-type transporter, integral membrane subunit
SACT1_0978 Urea transporter
SACT1_2418 ATP synthase subunit a
SACT1_5604 major facilitator superfamily MFS_1
SACT1_4891 ABC-type transporter, integral membrane subunit
SACT1_6507 protein of unknown function DUF6 transmembrane
SACT1_6754 ABC-type transporter, integral membrane subunit
SACT1_0787 ABC-2 type transporter
SACT1_2848 sodium/hydrogen exchanger
SACT1_0636 ABC-type transporter, integral membrane subunit
SACT1_1891 putative integral membrane protein
SACT1_1552 xanthine permease
SACT1_2894 putative secreted protein
SACT1_4508 sodium:dicarboxylate symporter
SACT1_7091 drug resistance transporter, EmrB/QacA subfamily
SACT1_2652 major facilitator superfamily MFS_1
SACT1_1741 amino acid permease-associated region
SACT1_1838 ABC-type transporter, integral membrane subunit
SACT1_2796 gluconate transporter
SACT1_5220 sodium/hydrogen exchanger
SACT1_6991 ABC-type transporter, integral membrane subunit
SACT1_3273 protein of unknown function DUF894 DitE
SACT1_7089 ABC-type transporter, integral membrane subunit
SACT1_7280 major facilitator superfamily MFS_1
SACT1_3467 major facilitator superfamily MFS_1
SACT1_1304 ABC-type transporter, integral membrane subunit
SACT1_6032 protein of unknown function DUF81
SACT1_4312 major facilitator superfamily MFS_1
SACT1_0876 ABC-type transporter, integral membrane subunit
SACT1_6123 citrate/H+ symporter, CitMHS family
SACT1_4359 Cl- channel voltage-gated family protein
SACT1_7325 branched-chain amino acid transport
SACT1_1160 protein of unknown function DUF140
SACT1_2265 Arsenical pump membrane protein
SACT1_3512 ABC-2 type transporter
SACT1_1018 major facilitator superfamily MFS_1
SACT1_3415 Xanthine/uracil/vitamin C permease
SACT1_5214 BioY protein
SACT1_3656 small multidrug resistance protein
SACT1_3895 SpdD2 protein
SACT1_4929 ABC-type transporter, integral membrane subunit
SACT1_3029 major facilitator superfamily MFS_1
SACT1_6312 ABC-type transporter, integral membrane subunit
SACT1_0919 L-lactate transport
SACT1_4356 ABC-2 type transporter
SACT1_0532 ABC-type transporter, integral membrane subunit
SACT1_6693 secretion protein snm4
SACT1_0967 ABC-type transporter, integral membrane subunit
SACT1_6496 major facilitator superfamily MFS_1
SACT1_6983 major facilitator superfamily permease
SACT1_0917 NADH-ubiquinone/plastoquinone oxidoreductase chain 3
SACT1_6887 major facilitator superfamily MFS_1
SACT1_2835 major facilitator superfamily MFS_1
SACT1_5544 drug resistance transporter, Bcr/CflA subfamily
SACT1_5591 ABC-type transporter, integral membrane subunit
SACT1_1201 ABC-type transporter, integral membrane subunit
SACT1_2404 ABC-2 type transporter
SACT1_0870 protein of unknown function DUF803
SACT1_6933 ABC-type transporter, integral membrane subunit
SACT1_1776 ABC-type transporter, integral membrane subunit
SACT1_3213 major facilitator superfamily MFS_1
SACT1_4210 phosphate ABC transporter, inner membrane subunit PstC
SACT1_5398 protein of unknown function DUF81
SACT1_0914 NADH-ubiquinone/plastoquinone oxidoreductase chain 6
SACT1_3261 major facilitator superfamily MFS_1
SACT1_6932 ABC-type transporter, integral membrane subunit
SACT1_0669 major facilitator superfamily MFS_1
SACT1_4255 ABC-2 type transporter
SACT1_4541 major facilitator superfamily MFS_1
SACT1_4638 major facilitator superfamily MFS_1
SACT1_0913 NADH-ubiquinone oxidoreductase chain 4L
SACT1_4443 protein of unknown function DUF81
SACT1_5396 Xanthine/uracil/vitamin C permease
SACT1_0912 NADH dehydrogenase (quinone)
SACT1_2924 Lysine exporter protein (LYSE/YGGA)
SACT1_5922 putative integral membrane protein
SACT1_1243 Bile acid:sodium symporter
SACT1_6967 ABC-type transporter, integral membrane subunit
SACT1_0911 proton-translocating NADH-quinone oxidoreductase, chain M
SACT1_3931 putative ABC transporter permease protein
SACT1_2820 ABC-2 type transporter
SACT1_3298 putative integral membrane transport protein
SACT1_4871 major facilitator superfamily MFS_1
SACT1_5873 major facilitator superfamily MFS_1
SACT1_6636 putative integral membrane protein
SACT1_0905 AbgT transporter
SACT1_5532 2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase
SACT1_0910 proton-translocating NADH-quinone oxidoreductase, chain N
SACT1_2115 ABC-type transporter, integral membrane subunit
SACT1_4635 protein of unknown function DUF6 transmembrane
SACT1_5777 major facilitator superfamily MFS_1
SACT1_3979 major facilitator superfamily MFS_1
SACT1_5536 major facilitator superfamily MFS_1
SACT1_6782 major facilitator superfamily MFS_1
SACT1_0616 virulence factor MVIN family protein
SACT1_4869 ABC-type transporter, integral membrane subunit
SACT1_1581 putative integral membrane protein
SACT1_4585 major facilitator superfamily MFS_1
SACT1_6536 small multidrug resistance protein
SACT1_4024 cell cycle protein
SACT1_5296 major facilitator superfamily MFS_1
SACT1_1865 major facilitator superfamily MFS_1
SACT1_4868 ABC-type transporter, integral membrane subunit
SACT1_0955 protein of unknown function DUF803
SACT1_4296 major facilitator superfamily MFS_1
SACT1_5104 major facilitator superfamily MFS_1
SACT1_0519 Bile acid:sodium symporter
SACT1_2394 putative ABC transporter permease protein
SACT1_0661 major facilitator superfamily MFS_1
SACT1_2062 major facilitator superfamily MFS_1
SACT1_4295 ABC-type transporter, integral membrane subunit
SACT1_6828 peptidase M48 Ste24p
SACT1_3446 major facilitator superfamily MFS_1
SACT1_6631 Lysine exporter protein (LYSE/YGGA)
SACT1_1048 ABC-type transporter, integral membrane subunit
SACT1_1528 protein of unknown function DUF6 transmembrane
SACT1_2016 branched-chain amino acid transport
SACT1_6154 gluconate transporter
SACT1_5051 major facilitator superfamily MFS_1
SACT1_5531 protein of unknown function DUF81
SACT1_6480 protein of unknown function DUF1290
SACT1_0373 ABC-type transporter, integral membrane subunit
SACT1_2392 putative ABC transporter permease protein
SACT1_2724 protein of unknown function DUF107
SACT1_7257 protein of unknown function UPF0118
SACT1_2772 putative integral membrane protein
SACT1_3201 NAD(P)H-quinone oxidoreductase subunit 4L
SACT1_7066 ABC-type transporter, integral membrane subunit

Some of these genes are labeled "protein of unknown function," but I think we can predict with high confidence, based on what we know about these proteins (namely, that they're hydrophobic), that the gene products in question involve membrane-associated functions.

Bioinformatics geeks, leave a comment below.