
Granularity


Granularity (also called graininess) is the degree to which a material or system is composed of distinguishable pieces, "granules" or "grains" (metaphorically). It can either refer to the extent to which a larger entity is subdivided, or the extent to which groups of smaller indistinguishable entities have joined together to become larger distinguishable entities.

Precision and ambiguity

Coarse-grained materials or systems have fewer, larger discrete components than fine-grained materials or systems.

  • A coarse-grained description of a system describes its large subcomponents.
  • A fine-grained description describes the smaller components of which the larger ones are composed.

The concepts of granularity, coarseness, and fineness are relative and are used when comparing systems or descriptions of systems. An example of increasingly fine granularity: a list of nations in the United Nations, a list of all states/provinces in those nations, a list of all cities in those states, and so on.
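For illustration, the following minimal Python sketch (with hypothetical data, not drawn from any cited source) holds the same geographic information at three increasingly fine levels of granularity.

# Coarse grain: one entry per nation.
nations = ["United States", "Canada"]

# Finer grain: nations subdivided into states/provinces.
states = {
    "United States": ["Florida", "New York"],
    "Canada": ["Ontario", "Quebec"],
}

# Finest grain shown here: states/provinces subdivided into cities.
cities = {
    ("United States", "Florida"): ["St. Petersburg", "Miami"],
    ("United States", "New York"): ["New York City", "Buffalo"],
    ("Canada", "Ontario"): ["Toronto", "Ottawa"],
    ("Canada", "Quebec"): ["Montreal", "Quebec City"],
}

# Finer-grained data can answer questions the coarser list cannot,
# e.g. how many listed cities each nation has.
cities_per_nation = {}
for (nation, _state), names in cities.items():
    cities_per_nation[nation] = cities_per_nation.get(nation, 0) + len(names)
print(cities_per_nation)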

Physics

A fine-grained description of a system is a detailed, exhaustive, low-level model of it. A coarse-grained description is a model in which some of this fine detail has been smoothed over or averaged out. The replacement of a fine-grained description with a lower-resolution coarse-grained model is called coarse-graining (see, for example, the second law of thermodynamics).

Molecular dynamics

In molecular dynamics, coarse graining consists of replacing an atomistic description of a biological molecule with a lower-resolution coarse-grained model that averages or smooths away fine details.

Coarse-grained models have been developed for investigating the longer time- and length-scale dynamics that are critical to many biological processes, such as those involving lipid membranes and proteins.[1] These concepts apply not only to biological molecules but also to inorganic molecules.

Coarse graining may remove certain degrees of freedom, such as the vibrational modes between two atoms, or it may represent two atoms as a single particle. The extent to which a system may be coarse-grained is bounded only by the accuracy with which one wishes to reproduce its dynamics and structural properties. This area of research is in its infancy, and although it is commonly used in biological modeling, the analytic theory behind it is poorly understood.
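For illustration, the following minimal Python sketch (using hypothetical coordinates rather than any particular force field or mapping scheme) shows one common coarse-graining step: each group of atoms is replaced by a single bead placed at the group's centre of mass, discarding the group's internal degrees of freedom.

import numpy as np

def coarse_grain(positions, masses, groups):
    # positions: (N, 3) array of atomic coordinates
    # masses:    (N,) array of atomic masses
    # groups:    list of lists of atom indices, one list per coarse-grained bead
    beads = []
    for idx in groups:
        m = masses[idx]
        r = positions[idx]
        # Place the bead at the group's centre of mass, discarding the
        # internal (e.g. vibrational) degrees of freedom within the group.
        beads.append((m[:, None] * r).sum(axis=0) / m.sum())
    return np.array(beads)

# Example: four atoms mapped onto two beads.
positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                      [5.0, 0.0, 0.0], [6.0, 0.0, 0.0]])
masses = np.array([12.0, 1.0, 12.0, 1.0])
print(coarse_grain(positions, masses, [[0, 1], [2, 3]]))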

Computing

In parallel computing, granularity refers to the amount of computation performed relative to the amount of communication, i.e., the ratio of computation to communication.[2]

Fine-grained parallelism means that individual tasks are relatively small in terms of code size and execution time, and data is transferred among processors frequently, in amounts of one or a few memory words. Coarse-grained parallelism is the opposite: data is communicated infrequently, after larger amounts of computation.

The finer the granularity, the greater the potential for parallelism and hence speed-up, but also the greater the overheads of synchronization and communication.[3] Granularity disintegrators exist as well and are important to understand in order to determine the appropriate level of granularity.[4]

To attain the best parallel performance, a balance must be struck between load balancing and communication overhead: if the granularity is too fine, performance can suffer from the increased communication overhead, while if it is too coarse, performance can suffer from load imbalance.
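As an illustration (a minimal Python sketch with arbitrary numbers, not taken from the cited sources), the same reduction can be split into many small tasks or a few large ones; the chunk size sets the granularity, trading scheduling and communication overhead against load balance.

from multiprocessing import Pool

def partial_sum(chunk):
    # The "computation" assigned to one task.
    return sum(x * x for x in chunk)

def parallel_sum(data, chunk_size, workers=4):
    # chunk_size controls the granularity: smaller chunks mean more tasks,
    # more scheduling/communication overhead, and finer-grained parallelism.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    data = list(range(100_000))
    fine = parallel_sum(data, chunk_size=100)       # many small tasks
    coarse = parallel_sum(data, chunk_size=25_000)  # a few large tasks
    assert fine == coarse  # same result; only the granularity differs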

Reconfigurable computing and supercomputing

In reconfigurable computing and in supercomputing, these terms refer to the width of the data path. The use of processing elements about one bit wide, such as the configurable logic blocks (CLBs) in an FPGA, is called fine-grained computing or fine-grained reconfigurability, whereas the use of wide data paths, such as 32-bit-wide resources like microprocessor CPUs or the data-stream-driven data path units (DPUs) in a reconfigurable datapath array (rDPA), is called coarse-grained computing or coarse-grained reconfigurability.

Data and information

The granularity of data refers to the size into which data fields are subdivided. For example, a postal address can be recorded, with coarse granularity, as a single field:

  1. address = 200 2nd Ave. South #358, St. Petersburg, FL 33701-4313 USA

or with fine granularity, as multiple fields:

  1. street address = 200 2nd Ave. South #358
  2. city = St. Petersburg
  3. state = FL
  4. postal code = 33701-4313
  5. country = USA

or even finer granularity:

  1. street = 2nd Ave. South
  2. address number = 200
  3. suite/apartment = #358
  4. city = St. Petersburg
  5. state = FL
  6. postal-code = 33701
  7. postal-code-add-on = 4313
  8. country = USA

Finer granularity incurs overheads for data input and storage. These manifest as a higher number of objects and methods in the object-oriented programming paradigm, or as more subroutine calls in procedural programming and parallel computing environments. However, finer granularity offers greater flexibility in data processing, since each data field can be treated in isolation when required. A performance problem caused by excessive granularity may not reveal itself until scalability becomes an issue.
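For illustration, a minimal Python sketch (class and field names are hypothetical) of the trade-off described above: a single coarse-grained field versus several fine-grained fields that can each be processed in isolation.

from dataclasses import dataclass

@dataclass
class CoarseAddress:
    address: str  # the whole address held in a single field

@dataclass
class FineAddress:
    street_address: str
    city: str
    state: str
    postal_code: str
    country: str

    def is_domestic(self) -> bool:
        # Finer granularity lets code operate on one field in isolation.
        return self.country == "USA"

a = CoarseAddress("200 2nd Ave. South #358, St. Petersburg, FL 33701-4313 USA")
b = FineAddress("200 2nd Ave. South #358", "St. Petersburg", "FL",
                "33701-4313", "USA")
print(b.is_domestic())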

Within database design and data warehouse design, data grain can also refer to the smallest combination of columns in a table which makes the rows (also called records) unique.[5]
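For illustration, a minimal Python sketch (with a hypothetical fact table) of checking whether a candidate set of columns could be the grain of a table, i.e. whether their combined values are unique across all rows.

# Hypothetical fact table: one row per (date, store, product) combination.
rows = [
    {"date": "2024-01-01", "store": "A", "product": "X", "units": 3},
    {"date": "2024-01-01", "store": "A", "product": "Y", "units": 1},
    {"date": "2024-01-02", "store": "A", "product": "X", "units": 2},
]

def identifies_rows(rows, columns):
    # True if the column combination uniquely identifies every row,
    # a necessary condition for it to be the table's grain.
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(keys) == len(set(keys))

print(identifies_rows(rows, ["date", "store"]))             # False: rows repeat
print(identifies_rows(rows, ["date", "store", "product"]))  # True: candidate grain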

Notes

  1. ^ Kmiecik, S.; Gront, D.; Kolinski, M.; Wieteska, L.; Dawid, A. E.; Kolinski, A. (2016). "Coarse-Grained Protein Models and Their Applications". Chemical Reviews. 116 (14): 7898–936. doi:10.1021/acs.chemrev.6b00163. PMID 27333362.
  2. ^ Spacey et al. 2012.
  3. ^ FOLDOC
  4. ^ "Software Architecture: The Hard Parts". Thoughtworks. Retrieved 2023-01-15.
  5. ^ Data grain: What granularity means in terms of data modeling

